So far we have considered the curvature of lines in three dimensions. We now upgrade the discussion to deal with the curvature of two-dimensional surfaces embedded in flat three-dimensional space. For one-dimensional surfaces (i.e. curves) we started by comparing the behaviour near closely spaced points to that of points on a circle (and then, subsequently, on a parabola). For a two-dimensional surface, we might imagine a sphere would be the measure of curvature. However, we can't assume that the curvature will be the same in two different directions, since this is generally not true. In fact, we must evaluate two curvature constants, $\kappa_{1}$ and $\kappa_{2}$, to characterize the curvature of a surface. It seems like a tough problem to determine these two constants, since we don't know in which directions to evaluate the curvature. The task is dramatically simplified by a discovery by Leonhard Euler.${ }^{10}$ If the two curvature constants aren't equal then there will be some direction where the curvature will be a minimum and another where it is a maximum. Euler's discovery was that these directions are perpendicular. As a result, just as the curvature of a line is measured by the radius of a circle, the curvature of the two-dimensional surface is measured using the major and minor axes of an ellipse.
We start with the coordinate approach, where we measure curvature by comparing the height of a curved surface above a plane which forms the tangent to the surface at some nearby point (Fig. 30.6). Taking our lead from our analysis of the curve, in two dimensions we have that the height $z$ of the curved surface, written in terms of coordinates $x^{i}$ in the tangent plane, is
where $K_{ij}$ are the components of a $2 \times 2$ real symmetric tensor. This tensor is the key to understanding the classical curvature of curved surfaces since it defines the ellipse whose major and minor axes are set by the eigenvalues of $K_{ij}$. These eigenvalues give the two principal curvatures, and their inverses give the corresponding radii of curvature, as we shall see below.
Example 30.5
Consider a surface embedded in three-dimensional space. The departure $z$ of the curved surface from the flat one can be expanded as
\begin{equation*}
z=\frac{1}{2} a x^{2}+b x y+\frac{1}{2} c y^{2} \tag{30.18}
\end{equation*}
or, equivalently,
z=\frac{1}{2}\left(\begin{array}{ll}
x & y
\end{array}\right)\left(\begin{array}{ll}
a & b \\
b & c
\end{array}\right)\binom{x}{y} . \tag{30.19}
It is always possible to make a coordinate transformation, effectively by rotating the axes, that diagonalizes this matrix.
The new variables $\xi$ and $\eta$ are the coordinates on the principal axes of the ellipse. In this context, the eigenvalues $\kappa_{i}$ are called the principal curvatures and can be written in terms of radii of curvature $\rho_{i}$ as $\kappa_{1}=1 / \rho_{1}$ and $\kappa_{2}=1 / \rho_{2}$. The principal axes give us the largest (major) and smallest (minor) widths of the ellipse, which provide our measure of curvature.
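The diagonalization just described is easy to check numerically. The sketch below (a minimal illustration; the coefficient values $a$, $b$, $c$ are arbitrary made-up choices) finds the principal curvatures as the eigenvalues of the symmetric matrix of eqn 30.19 and confirms Euler's observation that the principal directions are perpendicular:

```python
import numpy as np

# Hypothetical coefficients for z = (1/2) a x^2 + b x y + (1/2) c y^2 (eqn 30.18)
a, b, c = 3.0, 1.0, 2.0
K = np.array([[a, b],
              [b, c]])   # the 2x2 real symmetric curvature tensor K_ij

# Principal curvatures = eigenvalues; principal directions = eigenvectors
kappa, directions = np.linalg.eigh(K)

# Euler's discovery: the directions of extremal curvature are perpendicular
dot = directions[:, 0] @ directions[:, 1]
print(kappa)               # the two principal curvatures kappa_1, kappa_2
print(abs(dot) < 1e-12)    # True: the principal directions are orthogonal
```

The radii of curvature then follow as $\rho_{i}=1 / \kappa_{i}$.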
Let's apply this approach to a spherical surface of radius $a$. For the surface balanced on a plane we can write $z=a-\left(a^{2}-x^{2}-y^{2}\right)^{\frac{1}{2}}$. Near the point of contact, we have $z \approx \frac{1}{2 a}\left(x^{2}+y^{2}\right)$ and so we have two principal curvatures of $1 / a$, both with radius of curvature $a$, just as we would expect.

${ }^{11}$ Gauss' own version can be found in his 1827 paper Disquisitiones generales circa superficies curvas (General Investigations of Curved Surfaces). A translation of the paper along with a running commentary given in terms of modern mathematical notation can be found in Volume II of Michael Spivak's A Comprehensive Introduction to Differential Geometry and comes highly recommended.

${ }^{12}$ Modern English usage of the word egregious, which formerly meant 'remarkable', usually implies that something is remarkably bad. H. W. Fowler (1858-1933) notes that it is especially applied to the nouns "ass, coxcomb, liar, imposter, folly, blunder, waste", and that "Reversion to the original sense ... is mere pedantry."

${ }^{13}$ See the excellent book by Needham (2021) for more details and several proofs. Needham makes sense of the theorem which, in Gauss' original form (discussed here), admittedly looks like it was plucked magically from the air.
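The sphere-on-a-plane expansion can be checked symbolically. Since $z=\frac{1}{2} K_{ij} x^{i} x^{j}$ near the point of contact, $K_{ij}$ is just the Hessian of the height function evaluated at the origin. A short sympy sketch (assuming the standard height function for a sphere of radius $a$ resting on its tangent plane) confirms that both principal curvatures are $1/a$:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
a = sp.symbols('a', positive=True)

# Height of a sphere of radius a above the tangent plane at its point of contact
z = a - sp.sqrt(a**2 - x**2 - y**2)

# Since z = (1/2) K_ij x^i x^j near the contact point, K_ij is the Hessian at the origin
K = sp.hessian(z, (x, y)).subs({x: 0, y: 0})
print(K)  # Matrix([[1/a, 0], [0, 1/a]]): both principal curvatures equal 1/a
```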
Here's a look at Gauss' remarkable result.${ }^{13}$ In two dimensions, the flat, Euclidean plane is described by a line element
Gauss assumed that in a sufficiently small (that is, infinitesimal) patch of a curved space, it would always be possible to describe the space using such a coordinate system. However, a general two-dimensional curved space will not be able to be described by the (flat-space) $\xi^{i}$ coordinates over a finite neighbourhood. Instead, there will be a perfectly good coordinate system $\left(x^{1}, x^{2}\right)$ that does cover the curved space. In this latter coordinate system, the line element is written as (just as we've had previously)
As usual, the values of $g_{ij}$ depend on the particular coordinate system chosen (although it will turn out that they also encode the intrinsic properties of the space). ${ }^{14}$ This makes it similar to $s$ in the one-dimensional case.
Gauss sought to define a measure of curvature in terms of a function of the components of the metric $g_{ij}$ and their derivatives that depends only on the intrinsic properties of the space and not on the particular coordinate system chosen. The formidable equation that Gauss discovered is (employing the comma notation for derivatives)
The significance of this is that someone in possession of the components of the metric also has a failsafe means of calculating the curvature of the surface that cannot be the consequence merely of a particular choice of coordinates. Gauss was, understandably, elated to have discovered this wonderful result.
As a concrete example, we have for the sphere that $g_{\theta \theta}=a^{2}$ and $g_{\phi \phi}=a^{2} \sin ^{2} \theta$. We find that
which is to say that the curvature function is $1 /(\text{radius})^{2}$, given purely in terms of a property of the space itself, its radius. It is also heartening that the flat, Euclidean plane gives
We will have more to say about Gaussian curvature but, happily, we won't have to make use of Gauss' remarkable equation as there are several alternative ways of extracting $K$.
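One alternative route can be sketched right away: build the Riemann component $R_{1212}$ from the metric via the Christoffel symbols and form $K=R_{1212} / \operatorname{det} \boldsymbol{g}$ (the link made precise later, in eqn 30.65). The sympy sketch below is an illustration of that route rather than of Gauss' own formula; it recovers $K=1/a^{2}$ for the sphere from the metric components alone:

```python
import sympy as sp

theta, phi = sp.symbols('theta phi', real=True)
a = sp.symbols('a', positive=True)
coords = [theta, phi]

# Metric of a sphere of radius a: g_theta,theta = a^2, g_phi,phi = a^2 sin^2(theta)
g = sp.Matrix([[a**2, 0], [0, a**2 * sp.sin(theta)**2]])
ginv = g.inv()

def Gamma(l, i, j):
    # Christoffel symbols Gamma^l_ij built from the metric and its derivatives
    return sp.simplify(sum(ginv[l, m] * (sp.diff(g[m, i], coords[j])
                                         + sp.diff(g[m, j], coords[i])
                                         - sp.diff(g[i, j], coords[m]))
                           for m in range(2)) / 2)

def Riemann_up(r, s, m, n):
    # R^r_{smn} = d_m Gamma^r_{ns} - d_n Gamma^r_{ms} + Gamma.Gamma terms
    expr = sp.diff(Gamma(r, n, s), coords[m]) - sp.diff(Gamma(r, m, s), coords[n])
    expr += sum(Gamma(r, m, l) * Gamma(l, n, s)
                - Gamma(r, n, l) * Gamma(l, m, s) for l in range(2))
    return sp.simplify(expr)

# Lower the first index to get R_1212, then K = R_1212 / det(g)
R1212 = sp.simplify(sum(g[0, r] * Riemann_up(r, 1, 0, 1) for r in range(2)))
K = sp.simplify(R1212 / g.det())
print(K)  # the intrinsic curvature K = 1/a^2, independent of theta and phi
```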
30.4 Gauss' equation
In one dimension, we considered lengths of lines, so it would seem natural to look at areas in two dimensions. In this spirit, we can define the curvature of a surface at a point $\mathcal{P}$ as
\begin{equation*}
K(\mathcal{P})=\frac{\mathrm{d} \Omega}{\mathrm{d} A}=\lim _{\Delta A \rightarrow 0} \frac{\binom{\text { dimensionless area swept out }}{\text { on unit sphere by normals }}}{\binom{\text { corresponding area } \Delta A}{\text { on actual surface }}} \tag{30.29}
\end{equation*}
Turning to the description in terms of vectors, we define a two-dimensional surface embedded in flat three-dimensional space by a vector-valued function $\boldsymbol{X}\left(x^{1}, x^{2}\right)$, where the coordinates $x^{1}$ and $x^{2}$ parametrize the surface [i.e. the coordinate system $\left(x^{1}, x^{2}\right)$ lies within the surface${ }^{14}$]. That is, given a point with coordinates in the surface of $x^{1}$ and $x^{2}$, the vector-valued function $\boldsymbol{X}$ returns the vector from the origin to the surface at that point. The two-dimensional surface has two basis vectors, which are given by
Example 30.7
Let's now use this expression to derive the Gaussian curvature using these two basis vectors. The normal to the surface is $\boldsymbol{n}=\boldsymbol{e}_{1} \times \boldsymbol{e}_{2}$ and so the unit normal${ }^{15}$ is
We can now compute the curvature of a surface from knowledge of its basis vectors.
Although this is all very interesting, an approach even more closely based on modern differential geometry will be more useful to us. Consider the 3-vector $\boldsymbol{e}_{\nu}$ and its derivative $\frac{\partial \boldsymbol{e}_{\nu}}{\partial x^{\mu}}$. In general, we can choose to write the components of the derivative in the form
This has a part expressed in terms of vectors confined to the surface (the first term on the right) and a part expressed in terms of the bit sticking out of the surface (the second term). Since $\hat{\boldsymbol{n}} \cdot \hat{\boldsymbol{n}}=1$ and $\hat{\boldsymbol{n}} \cdot \boldsymbol{e}_{\nu}=0$, we also find an expression for the matrix $K_{\mu \nu}$
which is known${ }^{16}$ as Gauss' equation. The matrix $K_{\mu \nu}$ is designed to look like the Gaussian curvature function $K\left(x^{1}, x^{2}\right)$, but it is not yet clear how. We look into this in the next section.
Example 30.8
Let's compute the components of $K_{\mu \nu}$ for a cylinder. The points on a cylinder's surface are described by vectors, with Cartesian components given by
\boldsymbol{X}(\theta, z)=\left(\begin{array}{c}
a \cos \theta \\
a \sin \theta \\
z
\end{array}\right) \tag{30.38}
${ }^{15}$ It is the unit normal that is the useful quantity in this context, so we divide by the magnitude $|\boldsymbol{n}|=\left|\boldsymbol{e}_{1} \times \boldsymbol{e}_{2}\right|$. There are several cases below where we differentiate $\hat{\boldsymbol{n}}$. However, although $|\boldsymbol{n}|$ can depend on the coordinates, we don't differentiate this normalization factor. Rather we scale $\boldsymbol{n}$ by a factor $1 /|\boldsymbol{n}|$ at some position to make it a unit vector, and then treat the factor as a constant. ${ }^{16}$ Gauss came up with a lot of equations, and so this name doesn't uniquely identify it! ${ }^{17}$ For later use, we also have metric components $g_{\mu \nu}=\boldsymbol{e}_{\mu} \cdot \boldsymbol{e}_{\nu}$ of $g_{\theta \theta}=a^{2}$ and $g_{z z}=1$.
Fig. 30.7 A curve, passing through point $\mathcal{P}$, embedded in a two-dimensional surface. ${ }^{18}$ Some terminology in classical geometry includes the first fundamental form, defined as
This is simply the $\mathrm{d} s^{2}$ that we've written before. The second fundamental form is found by considering $\kappa_{n}=K_{\mu \nu} t^{\mu} t^{\nu}=K_{\mu \nu}\left(\mathrm{d} X^{\mu} / \mathrm{d} s\right)\left(\mathrm{d} X^{\nu} / \mathrm{d} s\right)$, and removing the path elements $\mathrm{d} s$ to write
The basis vectors and unit normal are found to be${ }^{17}$
\boldsymbol{e}_{\theta}=\left(\begin{array}{c}
-a \sin \theta \\
a \cos \theta \\
0
\end{array}\right), \quad \boldsymbol{e}_{z}=\left(\begin{array}{l}
0 \\
0 \\
1
\end{array}\right), \quad \hat{\boldsymbol{n}}=\left(\begin{array}{c}
\cos \theta \\
\sin \theta \\
0
\end{array}\right) \tag{30.39}
Differentiating, the only non-zero component results from
\frac{\partial \boldsymbol{e}_{\theta}}{\partial \theta}=\left(\begin{array}{c}
-a \cos \theta \\
-a \sin \theta \\
0
\end{array}\right) \tag{30.40}
and dotting this with $\hat{\boldsymbol{n}}$ yields $-a$. We conclude $K_{\theta \theta}=-a$ and all other components of $K_{\mu \nu}$ vanish.
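The whole chain, embedding, basis vectors, unit normal, then $K_{\mu \nu}$, is mechanical enough to hand to a computer algebra system. Here is an illustrative sympy re-derivation of eqns 30.38-30.40 (a sketch, not anything new):

```python
import sympy as sp

theta, z = sp.symbols('theta z', real=True)
a = sp.symbols('a', positive=True)
coords = [theta, z]

# Embedding of the cylinder (eqn 30.38)
X = sp.Matrix([a * sp.cos(theta), a * sp.sin(theta), z])

# Basis vectors e_mu = dX/dx^mu and unit normal n-hat = (e_1 x e_2)/|e_1 x e_2|
e = [X.diff(c) for c in coords]
n = e[0].cross(e[1])
nhat = sp.simplify(n / sp.sqrt(sp.simplify(n.dot(n))))

# Gauss' equation: K_mu,nu = (d e_nu / d x^mu) . n-hat
K = sp.Matrix(2, 2, lambda m, nu: sp.simplify(e[nu].diff(coords[m]).dot(nhat)))
print(K)  # Matrix([[-a, 0], [0, 0]]): only K_theta,theta = -a survives
```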
30.5 Intrinsic and extrinsic curvature
We can now find out how the symmetric matrix $K_{\mu \nu}=\frac{\partial \boldsymbol{e}_{\nu}}{\partial x^{\mu}} \cdot \hat{\boldsymbol{n}}$, that we met in the last section, relates to the Gaussian curvature. Consider a curve lying in the surface that passes through a point $\mathcal{P}$ (Fig. 30.7), where it has tangent vector $\boldsymbol{t}$. As usual, we parametrize the curve with the arc-length parameter $s$, but we note for later that we also have access to the coordinates $x^{\mu}$ that lie within the surface. Differentiating the tangent vector along the curve, we have
where $\boldsymbol{e}$ is some unit vector formed from a linear combination of $\boldsymbol{e}_{1}$ and $\boldsymbol{e}_{2}$, and is, as such, perpendicular to $\hat{\boldsymbol{n}}$.
Now consider the derivative $\frac{\mathrm{d}}{\mathrm{d} s}$. When represented in terms of the coordinates in the surface, this is
Therefore, a term $t^{\mu} \frac{\mathrm{d} \boldsymbol{e}_{\mu}}{\mathrm{d} s}$ can be written as $t^{\mu} t^{\nu} \frac{\partial \boldsymbol{e}_{\mu}}{\partial x^{\nu}}$. As a result, when we project eqn 30.42 along $\hat{\boldsymbol{n}}$ we can extract
where we've used that $\hat{\boldsymbol{n}} \cdot \boldsymbol{e}=0$ and $\hat{\boldsymbol{n}} \cdot \hat{\boldsymbol{n}}=1$. Finally, recall from Gauss' equation $\frac{\partial \boldsymbol{e}_{\mu}}{\partial x^{\nu}} \cdot \hat{\boldsymbol{n}}=K_{\nu \mu}=K_{\mu \nu}$, which says that the part of $\partial \boldsymbol{e}_{\mu} / \partial x^{\nu}$ pointing along $\hat{\boldsymbol{n}}$ is $K_{\mu \nu}$ and so${ }^{18}$
To relate this expression to curvature, we will follow Gauss and look not just at this one curve, but at all of the curves passing through $\mathcal{P}$. For each, we calculate $\kappa_{n}$ and then find the extremal values, i.e. the largest and smallest values of $\kappa_{n}$.
Example 30.9
To do this, we extremize the quantity $K_{\mu \nu} t^{\mu} t^{\nu}$ subject to the constraint that the tangent vectors are properly normalized to unity, written${ }^{19}$ as $g_{\mu \nu} t^{\mu} t^{\nu}=1$. We can carry out this procedure using a Lagrange multiplier $k$. So, following the usual procedure, we calculate
This is an eigenvalue equation for the matrix $\underline{\boldsymbol{A}}$ with components${ }^{20}$ $A^{\lambda}{ }_{\nu}=g^{\lambda \mu} K_{\mu \nu}$. It is the eigenvalues $k_{1}$ and $k_{2}$ of $\underline{\boldsymbol{A}}$ that will be important. To see how, we multiply through by $t_{\lambda}=g_{\lambda \sigma} t^{\sigma}$ and contract the index $\lambda$ to obtain
and so
k=K_{\mu \nu} t^{\mu} t^{\nu}=\kappa_{n}^{\text {extremal }} .
We conclude that the two eigenvalues $k_{1}$ and $k_{2}$ of the matrix $\underline{\boldsymbol{A}}$ with components $g^{\lambda \mu} K_{\mu \nu}$ give the extremal values of the curvature $\kappa_{n}^{\text {extremal }}$.
The approach from the last example gives us access to two invariants of the matrix $\underline{\boldsymbol{A}}$ that characterize the curvature: (i) the determinant $\operatorname{det} \underline{\boldsymbol{A}}=\operatorname{det} \underline{\boldsymbol{K}} / \operatorname{det} \underline{\boldsymbol{g}}=k_{1} k_{2}$ and (ii) the trace $\operatorname{Tr} \underline{\boldsymbol{A}}=\operatorname{Tr}\left(\underline{\boldsymbol{g}}^{-1} \underline{\boldsymbol{K}}\right)=k_{1}+k_{2}$. It turns out that the Gaussian curvature or intrinsic curvature that we previously called $K$ is given by the determinant $k_{1} k_{2}$, while the trace $\left(k_{1}+k_{2}\right)$ gives a quantity called the extrinsic curvature. What do these terms mean? A good example is a cylinder (see Fig. 30.8) which has extrinsic curvature by virtue of the way in which the two-dimensional surface is embedded in three-dimensional space. However, it is not intrinsically curved because that surface can easily be unwrapped and placed on a flat surface (as shown in the figure).
Unlike the cylinder (see Fig. 30.8), a sphere cannot be made from a flat piece of paper (at least, not without cutting the paper) and so possesses both intrinsic and extrinsic curvature. The cylinder, on the other hand, has extrinsic curvature but zero intrinsic curvature. It is therefore the intrinsic curvature that cannot be removed by unwrapping the surface without cutting the paper.${ }^{21}$
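For the cylinder this is a two-line numerical check. Using the components from Example 30.8 (with an arbitrary radius, here $a=2$, purely for illustration), the eigenvalues of $\underline{\boldsymbol{A}}=\underline{\boldsymbol{g}}^{-1} \underline{\boldsymbol{K}}$ behave exactly as claimed:

```python
import numpy as np

a = 2.0  # cylinder radius; an arbitrary illustrative value

# From Example 30.8: K_theta,theta = -a is the only non-zero component of K,
# while the metric has g_theta,theta = a^2 and g_zz = 1
K = np.array([[-a, 0.0], [0.0, 0.0]])
g = np.array([[a**2, 0.0], [0.0, 1.0]])

A = np.linalg.inv(g) @ K
k1, k2 = np.linalg.eigvals(A)

print(np.isclose(k1 * k2, 0.0))       # True: intrinsic curvature k1 k2 = 0
print(np.isclose(k1 + k2, -1.0 / a))  # True: extrinsic curvature k1 + k2 = -1/a
```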
Example 30.10
In this example, we evaluate the intrinsic and extrinsic curvature of two surfaces embedded in three-dimensional space.
(a) Consider a parabolic surface. This is described by coordinates $\left(x^{1}, x^{2}\right)=(x, y)$ with
\boldsymbol{X}=\left(\begin{array}{c}
x \\
y \\
\frac{1}{2} u x^{2}+\frac{1}{2} v y^{2}
\end{array}\right) \tag{30.56}
${ }^{19}$ Note how the metric is needed here as the surface is, in general, curved. ${ }^{20}$ Example: For the cylinder in Example 30.8 we write a matrix with components $K_{\mu \nu}$
\underline{\boldsymbol{K}}=\left(\begin{array}{cc}
-a & 0 \\
0 & 0
\end{array}\right) \tag{30.51}
and a matrix with components $g^{\lambda \mu}$
Fig. 30.8 A cylinder (the tin can) looks curved, but it is not intrinsically curved, only curved extrinsically (by the way it is embedded in three-dimensional space). The cylindrical surface can be unwound and placed on a flat surface without tearing or distorting. For a cylinder of radius $a$, the eigenvalues of the matrix $\underline{\boldsymbol{A}}$ are $k_{1}=-1 / a$ and $k_{2}=0$, so we have zero intrinsic curvature $K=k_{1} k_{2}=0$, even though the extrinsic curvature is non-zero ($k_{1}+k_{2}=-1 / a$). ${ }^{21}$ For one-dimensional curves, a notion of intrinsic curvature doesn't really make sense, since the curve can always be flattened out; however curved your shoelaces are, you can always unthread them and lay them out in straight lines. Thus, one-dimensional curves only possess extrinsic curvature. ${ }^{22}$ Recall that the metric tensor $\boldsymbol{g}$ has components
g_{\mu \nu}=\left(\begin{array}{cc}
a^{2} & 0 \\
0 & a^{2} \sin ^{2} \theta
\end{array}\right) \tag{30.60}
so that $\operatorname{det} \boldsymbol{g}=a^{4} \sin ^{2} \theta$. The inverse metric has components
${ }^{23}$ Riemann sets out his aims in his inaugural lecture delivered at the University of Göttingen in 1853. The lecture was intended to be accessible to the entire faculty and so contains minimal mathematical detail. Its immense significance is explained in Spivak, Vol. II. ${ }^{24}$ Why? Since $g_{\mu \nu}=g_{\nu \mu}$, the independent components are (i) the diagonal elements (of which there are $n$) and (ii) half of the off-diagonal elements (and there are $n^{2}-n$ off-diagonal elements). Adding these we have $n+\left(n^{2}-n\right) / 2=n(n+1) / 2$, as claimed. ${ }^{25}$ The function $Q$ is a function of $2 n$ variables which take the form of two vectors $\boldsymbol{X}$ and $\boldsymbol{Y}$. We assume these two vectors span the two-dimensional space $W$ and simply write $Q(W)$ for simplicity here.
The metric is then described by the matrix
g_{\mu \nu}=\left(\begin{array}{cc}
1+u^{2} x^{2} & u v x y \\
u v x y & 1+v^{2} y^{2}
\end{array}\right) \tag{30.57}
We have the basis vectors and unit normal
\boldsymbol{e}_{x}=\left(\begin{array}{c}
1 \\
0 \\
u x
\end{array}\right), \quad \boldsymbol{e}_{y}=\left(\begin{array}{c}
0 \\
1 \\
v y
\end{array}\right), \quad \hat{\boldsymbol{n}}=\frac{1}{|\boldsymbol{n}|}\left(\begin{array}{c}
-u x \\
-v y \\
1
\end{array}\right) \tag{30.58}
where $|\boldsymbol{n}|=\left(1+u^{2} x^{2}+v^{2} y^{2}\right)^{\frac{1}{2}}$. Taking derivatives yields
with other derivatives vanishing. Let's concentrate on the curvature evaluated at the origin, where we have that $g_{11}=g_{22}=1$, $g_{12}=g_{21}=0$, $K_{11}=u$ and $K_{22}=v$. We have that the intrinsic curvature is $K=u v$ and the extrinsic curvature is $u+v$.

(b) Next, consider the spherical surface with coordinates $\left(x^{1}, x^{2}\right)=(\theta, \phi)$. The surface is written as${ }^{22}$
\boldsymbol{X}=\left(\begin{array}{c}
a \sin \theta \cos \phi \\
a \sin \theta \sin \phi \\
a \cos \theta
\end{array}\right) \tag{30.62}
The basis vectors and unit normal follow as
\boldsymbol{e}_{\theta}=\left(\begin{array}{c}
a \cos \theta \cos \phi \\
a \cos \theta \sin \phi \\
-a \sin \theta
\end{array}\right), \quad \boldsymbol{e}_{\phi}=\left(\begin{array}{c}
-a \sin \theta \sin \phi \\
a \sin \theta \cos \phi \\
0
\end{array}\right), \quad \hat{\boldsymbol{n}}=\left(\begin{array}{c}
\sin \theta \cos \phi \\
\sin \theta \sin \phi \\
\cos \theta
\end{array}\right)
We have non-zero components $K_{11}=-a$ and $K_{22}=-a \sin ^{2} \theta$. We find an intrinsic curvature $\operatorname{det} \underline{\boldsymbol{K}} / \operatorname{det} \underline{\boldsymbol{g}}=1 / a^{2}$ and an extrinsic curvature $\operatorname{Tr}\left(\underline{\boldsymbol{g}}^{-1} \underline{\boldsymbol{K}}\right)=-2 / a$.
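The sphere calculation in part (b) can likewise be automated end to end. A sympy sketch (restricting to $0<\theta<\pi$ so that $|\boldsymbol{n}|=a^{2} \sin \theta$, matching the unit normal quoted above):

```python
import sympy as sp

theta, phi = sp.symbols('theta phi', real=True)
a = sp.symbols('a', positive=True)
coords = [theta, phi]

# Embedding of the sphere (eqn 30.62)
X = sp.Matrix([a * sp.sin(theta) * sp.cos(phi),
               a * sp.sin(theta) * sp.sin(phi),
               a * sp.cos(theta)])

e = [X.diff(c) for c in coords]                                # basis vectors
g = sp.Matrix(2, 2, lambda m, n: sp.simplify(e[m].dot(e[n])))  # induced metric

# Unit normal; we take 0 < theta < pi so that |n| = a^2 sin(theta)
n = e[0].cross(e[1])
nhat = sp.simplify(n / (a**2 * sp.sin(theta)))

# Gauss' equation K_mu,nu = (d e_nu / d x^mu) . n-hat, then the two invariants
K = sp.Matrix(2, 2, lambda m, nu: sp.simplify(e[nu].diff(coords[m]).dot(nhat)))
intrinsic = sp.simplify(K.det() / g.det())       # det K / det g = k1 k2
extrinsic = sp.simplify((g.inv() * K).trace())   # Tr(g^-1 K) = k1 + k2
print(intrinsic, extrinsic)
```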
30.6 Riemann's project
Bernhard Riemann aimed to generalize Gauss' remarkable result so that any observer confined to an $n$-dimensional space would be able to evaluate the intrinsic curvature.${ }^{23}$ The success of his approach laid the foundations for general relativity.
Broadly speaking, Riemann's method was to ask: when does the introduction of a new coordinate system change a metric $\boldsymbol{g}$ with components $g_{\mu \nu}$, defined in $n$-dimensional space, into some other metric $\boldsymbol{a}$ with components $a_{\mu \nu}$? Riemann argued that the metric $\boldsymbol{g}$ is determined by $n(n+1) / 2$ functions${ }^{24}$ but that a new coordinate system can be defined using $n$ functions. As a result, the new metric must be determined by $n(n+1) / 2-n=n(n-1) / 2$ functions. Riemann then claimed that there is a quadratic function${ }^{25}$ $Q$ that uniquely assigns a number to a two-dimensional space $W$ and that (i) for a two-dimensional manifold $-3 Q(W)$ is the Gaussian curvature; (ii) for higher-dimensional manifolds, $-3 Q(W)$ describes the Gaussian curvature of the two-dimensional subspace $W$ of the manifold; and (iii) there are $n(n-1) / 2$ independent two-dimensional subspaces for an $n$-dimensional vector space; if $Q(W)$ is known for each of these, then the metric is completely determined. This all seems a little obscure, but was shortly to be reformulated into something more familiar.
Some eight years later, in 1861, Riemann submitted a paper to the Paris Academy as an entry to a competition aiming to provide the answer to a question in the problem of heat conduction.${ }^{26}$ The paper contains the first instance of (what we would now call) the Riemann curvature tensor $\boldsymbol{R}(\;,\;,\;,\;)$ in terms of the components of the metric. Riemann's paper asked what conditions make a space flat and gave the answer that what is needed is that the components of $\boldsymbol{R}$ should vanish. The number $Q(W)$ turns out to be proportional to the output of the $(0,4)$ version of the tensor $\boldsymbol{R}$ when its slots are filled by linearly independent vectors $\boldsymbol{X}$ and $\boldsymbol{Y}$ that span the two-dimensional subspace $W$.${ }^{27}$ The significance of this is that the curvature of space determines the metric.
We shall not examine Riemann's method in any more detail.${ }^{28}$ The important point is that Riemann's curvature tensor, which can be written in terms of the components of the metric only, not only contains the Gaussian curvature as a special case in two dimensions but also generalizes the notion of curvature to higher-dimensional cases. The promised link in two dimensions between the Gaussian curvature $K$ and the Riemann tensor is
where $g=\operatorname{det} g_{\mu \nu}$, the determinant of the metric tensor $\boldsymbol{g}$.
Example 30.11
We can now justify the link between the Riemann tensor and the Gaussian curvature. The simplest route is to use eqn 30.65 to write $4 g R_{1212}=4 g^{2} K$. We then make use of the relationship from eqn 11.22 between the components of $\boldsymbol{R}$ and the metric
then putting it all together with eqn 30.25 allows one to verify the claim. Try it and see!
This completes our review of classical curvature. In the next chapter, we shall begin to examine the more modern approach to geometry that most naturally fits with the physics of general relativity. ${ }^{26}$ Riemann, whose health was poor at the time, didn't win the prize owing to the lack of detail in his paper. Nobody else won the prize and it was withdrawn in 1868. ${ }^{27}$ The explicit equation is $3 Q(\boldsymbol{X}, \boldsymbol{Y})=-\boldsymbol{R}(\boldsymbol{X}, \boldsymbol{Y}, \boldsymbol{X}, \boldsymbol{Y})$, where $\boldsymbol{R}$ is the $(0,4)$ version of the Riemann tensor, with components $R_{\mu \nu \alpha \beta}$. ${ }^{28}$ The interested reader should consult the masterful discussion in Spivak, Vol. II.
Chapter summary
The curvature of a line can be described in differential geometry using the Frenet-Serret equations.
Gaussian curvature allows the measurement of the intrinsic curvature of a two-dimensional surface, using measurements made entirely within that surface.
Riemann's result upgrades the Gaussian curvature to higher dimensions.
Exercises
(30.1) A mechanical definition of the curvature of a curve is possible if we imagine a unit mass traversing the curve (confined to a plane) at unit speed. The curvature is the force perpendicular to the curve required to maintain this motion. Prove that this definition is equivalent to that used in this chapter.
(30.2) Another method to compute Gaussian curvature is to use parallel transport to carry a vector around a loop on a surface. We then define
\begin{equation*}
\binom{\text { Gaussian }}{\text { curvature }}=\frac{\binom{\text { angle vector }}{\text { turns through }}}{\binom{\text { area }}{\text { of loop }}} \tag{30.70}
\end{equation*}
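This definition can be tested numerically. The sketch below (a self-contained illustration on the unit sphere, where $K=1$; the colatitude value $\theta_0$ is an arbitrary choice) integrates the parallel-transport equation once around a circle of colatitude $\theta_0$ and compares the angle the vector turns through with the area of the enclosed spherical cap, $2 \pi\left(1-\cos \theta_{0}\right)$:

```python
import numpy as np

theta0 = 0.7   # arbitrary colatitude of the loop on the unit sphere

def rhs(V):
    # Parallel transport along theta = theta0: dV^mu/dphi = -Gamma^mu_{phi nu} V^nu,
    # with Gamma^theta_{phi phi} = -sin(theta)cos(theta), Gamma^phi_{theta phi} = cot(theta)
    Vth, Vph = V
    return np.array([np.sin(theta0) * np.cos(theta0) * Vph,
                     -np.cos(theta0) / np.sin(theta0) * Vth])

# RK4 integration once around the loop, starting with V along e_theta
V = np.array([1.0, 0.0])
steps = 20000
h = 2 * np.pi / steps
for _ in range(steps):
    k1 = rhs(V)
    k2 = rhs(V + h / 2 * k1)
    k3 = rhs(V + h / 2 * k2)
    k4 = rhs(V + h * k3)
    V = V + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

# Angle turned, measured in an orthonormal frame (e_theta, e_phi / sin(theta0))
angle_turned = np.arctan2(np.sin(theta0) * V[1], V[0]) % (2 * np.pi)
area_of_loop = 2 * np.pi * (1 - np.cos(theta0))   # area of the enclosed cap

print(np.isclose(angle_turned / area_of_loop, 1.0))  # True: the ratio is K = 1
```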
(30.3) We first met the Bertrand-Diquet-Puiseux theorem in Exercise 3.3. Here we use it again.
Another way of describing Gaussian curvature is to use the (normalized) difference between the circumference of a circle in the plane and a circle on the surface. The circumference of the circular locus of points a distance $\epsilon$ from a point differs from the Euclidean value $2 \pi \epsilon$ by a correction factor. The curvature is defined as
represents the sum of the eigenvalues.
(30.5) (a) Consider a curve embedded in a surface. Differentiate the equation $\boldsymbol{t} \cdot \hat{\boldsymbol{n}}=0$ along the curve and show that
as we had before.
(30.6) (a) Verify the result for the parabolic surface in Example 30.10 using Gauss's equation (eqn 30.37) to compute the components K_(mu nu)K_{\mu \nu} for a general point (i.e. not the origin).
(b) Show that the same result is found using Weingarten's equation from the previous problem.
(c) Verify that the same result is obtained for the intrinsic curvature $K$ using eqn 30.35.
(30.7) Consider a three-dimensional spacelike hypersurface $\Sigma$ with a timelike unit normal $\boldsymbol{n}$. Construct a $(1,1)$ projection operator $\boldsymbol{P}$ with components
(a) Evaluate $P^{\mu}{ }_{\alpha} P^{\alpha}{ }_{\nu}$.
(b) Show that the operator $P^{\mu}{ }_{\nu}$ acts on a vector $\boldsymbol{X}$ to produce a vector $\bar{\boldsymbol{X}}$ that is tangent to the surface.
(c) Define an induced metric $\boldsymbol{h}$ via
Show that (i) $h_{\alpha \beta}=g_{\alpha \beta}+n_{\alpha} n_{\beta}$ and (ii) $h_{\alpha \beta} \bar{X}^{\alpha} \bar{Y}^{\beta}=g_{\alpha \beta} \bar{X}^{\alpha} \bar{Y}^{\beta}$.
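Although the projector components are elided in this extract, the form consistent with part (c)(i) is $P^{\mu}{ }_{\nu}=\delta^{\mu}{ }_{\nu}+n^{\mu} n_{\nu}$ for a timelike normal with $n_{\mu} n^{\mu}=-1$; that form is an assumption here. A quick numerical check in Minkowski space:

```python
import numpy as np

# Minkowski metric (signature -+++) and a timelike unit normal with n.n = -1;
# the projector form P^mu_nu = delta^mu_nu + n^mu n_nu is assumed here,
# consistent with the induced metric h_ab = g_ab + n_a n_b of part (c)
eta = np.diag([-1.0, 1.0, 1.0, 1.0])
n_down = np.array([-1.0, 0.0, 0.0, 0.0])    # n_mu
n_up = np.linalg.inv(eta) @ n_down          # n^mu = (1, 0, 0, 0)

P = np.eye(4) + np.outer(n_up, n_down)      # P^mu_nu

print(np.allclose(P @ P, P))     # True: projecting twice is the same as once
print(np.allclose(P @ n_up, 0))  # True: the normal direction is projected away
```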
(30.8) Now define an induced covariant derivative $\boldsymbol{D}$, whose components are given via
where, now, the vector field $\boldsymbol{X}$ is restricted to be tangent to $\Sigma$. The double projection, as opposed to the single projection $P^{\beta}{ }_{\mu}\left(\boldsymbol{\nabla}_{\beta} \boldsymbol{X}\right)^{\alpha}$, is designed to output a derivative as a vector that is within $\Sigma$, cutting off the normal part.
(a) By generalizing this definition, show that $\left(\boldsymbol{D}_{\mu} \boldsymbol{h}\right)_{\alpha \beta}=0$.
(b) Show that, since $\boldsymbol{X}$ is tangent to $\Sigma$, we have $P^{\mu}{ }_{\nu} X^{\nu}=X^{\mu}$.
(c) Starting from the fact that $n_{\mu} X^{\mu}=0$, show further that
(d) Show finally that, for vector field $\boldsymbol{Y}$ that is also tangent to $\Sigma$, we have
\begin{equation*}
\nabla_{Y} X=D_{Y} X+K(X, Y) n \tag{30.88}
\end{equation*}
Hint: The first term is clearly the part tangent to $\Sigma$. To obtain the second term, consider the normal component of $P^{\mu}{ }_{\alpha}\left(\boldsymbol{\nabla}_{\mu} \boldsymbol{X}\right)^{\beta}$.
This formalism is useful in that we can use it to describe the Riemann tensor in terms of quantities projected into the hypersurface $\Sigma$. These are the Gauss equation
31.1 Old notions of vectors and gradients
31.2 Vectors and vector fields
31.3 Linear slot machines again
31.4 Tensors again
${ }^{1}$ That is to say, we leave out the structure given by the metric, its vector-space structure and also its topological structure. We shall gradually reintroduce these features in the following chapters.
Fig. 31.1 Left: some manifolds, smooth enough that the region around any point looks locally flat; right: some non-manifolds with points whose neighbourhood does not look smooth at any level of magnification.
A reintroduction to geometry
It is no matter what you teach them first, any more than what leg you shall put into your breeches first.
Samuel Johnson (1709-1784)
In this chapter, we once again meet the geometry of vectors and 1-forms. Our goal is to more thoroughly define an approach that is free from the shackles of coordinates. This will hinge on the observation that each vector is equivalent to a derivative.
Although not essential for our purposes here, it is worth keeping in mind that the arena in which we work in this chapter is a manifold. This is a space, often called $\mathcal{M}$, that is smooth: locally resembling the smoothness of $\mathbb{R}^n$, the usual Euclidean (i.e. flat) $n$-dimensional space. It is this smoothness that characterizes a manifold, since we leave out the rest of the rich structure${}^{1}$ of $\mathbb{R}^n$. In this smooth space, there are points called things like $\mathcal{P}$, $\mathcal{Q}$, $\mathcal{A}$ and $\mathcal{B}$. We can cover patches of this space with coordinates, so that $\mathcal{P}$ can be described by a set of $n$ coordinates $(x^1 \ldots x^n)$, although this won't always be necessary. Curves are well-defined objects in our manifolds, parametrized by some quantity $\lambda$ that varies monotonically along the curve. Differentiation, which measures changes along curves, is also a well-defined operation.
Example 31.1
It is useful to consider which spaces are smooth enough to qualify as a manifold. Examples of spaces that are manifolds include all of the $n$-dimensional Euclidean spaces $\mathbb{R}^n$, which include $\mathbb{R}^1$ (a line), $\mathbb{R}^2$ (a plane), etc. (These clearly are locally identical to $\mathbb{R}^n$, since they are the spaces $\mathbb{R}^n$.) Other good examples which are smooth enough to look locally flat are (i) the one-dimensional circle, called $S^1$; (ii) the two-dimensional surface of a sphere, called $S^2$; and (iii) the two-dimensional surface of a torus. Some of these are shown on the left-hand side of Fig. 31.1. Perhaps more illuminating are spaces that aren't smooth enough to qualify as manifolds. Examples include (i) a line that juts out of a plane; (ii) a double cone; (iii) a line with a kink; and (iv) a line that crosses itself. All of these structures have a point which, even when blown up to a very large size, will never look locally flat. These latter examples are shown on the right-hand side of Fig. 31.1.
Our first task in this chapter will be to identify vectors and 1-forms. These don't actually live in the manifold $\mathcal{M}$ itself, but in related manifolds. Specifically, a vector defined at a point $\mathcal{P}$ lives in a manifold called a tangent space, while 1-forms live in a dual space. For now, we will keep things as simple as possible and put these subtleties aside, retaining only the notion of points in a smooth manifold $\mathcal{M}$ that, locally, looks like flat Euclidean space. So forget, temporarily, about the metric, and also the covariant derivatives and curvature tensors that the metric field can generate, as we look at how a smooth space, with a minimum of structure, can host a powerful geometry that will allow us many insights into the physics of relativity.
31.1 Old notions of vectors and gradients
We want to define vectors and describe surfaces in space. Let's review the old technology for doing this.
Example 31.2
The first time we meet vectors we are taught to picture them as a directed straight line linking two points: for example, the line from point $\mathcal{A}$ to point $\mathcal{B}$ (Fig. 31.2). We then write a vector in terms of coordinates as
$$\boldsymbol{u}=u^{1}\boldsymbol{e}_{1}+u^{2}\boldsymbol{e}_{2}+u^{3}\boldsymbol{e}_{3},$$
or $\boldsymbol{u}=u^{\alpha}\boldsymbol{e}_{\alpha}$, where the $\boldsymbol{e}_{\alpha}$ form an appropriate basis, spanning the space in which we're working. Often we use an orthonormal basis such that $\boldsymbol{e}_{\alpha}\cdot\boldsymbol{e}_{\beta}=\delta_{\alpha\beta}$.
If we are presented with a curved surface defined, in three spatial dimensions, by a function such as $z=f(x,y)$, a particularly useful object is the gradient vector of the surface. We find this by rewriting $f$ as a function $h(x,y,z)=0$, whose gradient vector is
$$\boldsymbol{\nabla} h=\frac{\partial h}{\partial x^{\alpha}}\,\boldsymbol{e}_{\alpha}=h_{,\alpha}\,\boldsymbol{e}_{\alpha}.$$
This defines${}^{2}$ a vector pointing normal to the tangent plane of the surface. If we combine a vector $\boldsymbol{v}$ and the gradient vector using the dot product, we obtain a useful object: the directional derivative of $h$, often denoted $\partial_{\boldsymbol{v}}h$, given by
$$\partial_{\boldsymbol{v}}h=\boldsymbol{v}\cdot\boldsymbol{\nabla} h=v^{\alpha}\frac{\partial h}{\partial x^{\alpha}},$$
which tells us the change in the function $h$ along the direction of the vector $\boldsymbol{v}$: that is, the value of $h$ at the tip of the vector $\boldsymbol{v}$, minus the value of $h$ at the base of $\boldsymbol{v}$. The output of the directional derivative is a number.
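As a numerical sanity check, the dot product $\boldsymbol{v}\cdot\boldsymbol{\nabla}h$ can be compared against the 'tip minus base' difference directly. A minimal sketch (the surface $z=x^2+y^2$, the point and the vector are illustrative choices, not from the text):

```python
import numpy as np

# Surface z = f(x, y) = x^2 + y^2, rewritten as h(x, y, z) = 0.
def h(p):
    x, y, z = p
    return x**2 + y**2 - z

def grad_h(p):
    """The gradient vector (dh/dx, dh/dy, dh/dz)."""
    x, y, z = p
    return np.array([2 * x, 2 * y, -1.0])

p = np.array([1.0, 2.0, 5.0])    # a point on the surface (h = 0 here)
v = np.array([0.3, -0.2, 0.1])   # an arbitrary vector

# Directional derivative: dot product of v with the gradient vector.
dv_h = v @ grad_h(p)

# 'Value of h at the tip of v minus the value at the base', per unit step.
eps = 1e-6
fd = (h(p + eps * v) - h(p)) / eps

print(dv_h, fd)   # the two agree to O(eps)
```

The finite-difference quotient approaches $\partial_{\boldsymbol{v}}h$ as the step $\varepsilon\to 0$, which is exactly the tip-minus-base picture above.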
For our purposes this old technology simply will not do. However, these ideas, when taken together with some new ones, allow us to tighten up the definitions and come up with a more useful geometry. For example, we will see how having 1-forms allows us to abandon the rather forced (and non-covariant) notion of the gradient outputting a vector normal to a surface.
Fig. 31.2 A vector stretching between points $\mathcal{A}$ and $\mathcal{B}$. ${}^{2}$ Interpreting this as a vector immediately looks wrong from our tensor conventions of summing over one up and one down index, since it appears that we have components $h_{,\alpha}\boldsymbol{e}_{\alpha}$. ${}^{3}$ In words, input a value of $\lambda$ and output a position $\mathcal{P}$ on the curve.
Fig. 31.3 A vector as a derivative of
Fig. 31.4 A vector as a derivative in a coordinate system. We use the relationship between the point and the coordinates, and also between the coordinates and the parametrization of the curve. These are linked by the chain rule. ${}^{4}$ Or, if you prefer, the tangent vector $\boldsymbol{v}$ is a derivative $\mathrm{d}/\mathrm{d}\lambda$.
31.2 Vectors and vector fields
We often think of a vector as a directed line $\mathcal{B}-\mathcal{A}$, joining two points $\mathcal{A}$ and $\mathcal{B}$ as in Fig. 31.2. Defining a vector in terms of two points is cumbersome. We would prefer to have the concept of a vector at a single point $\mathcal{P}$. Let's parametrize a straight line with a parameter $\lambda$ by writing $\mathcal{P}(\lambda)=\mathcal{A}+\lambda(\mathcal{B}-\mathcal{A})$. This allows us to extract the vector as the difference between tip and base via
$$\mathcal{B}-\mathcal{A}=\frac{\mathrm{d}\mathcal{P}(\lambda)}{\mathrm{d}\lambda}.$$
This idea of a vector as equivalent to a derivative is the key one in this chapter.
Advancing beyond straight lines, we can define a curve in terms of another parametrized path${}^{3}$ $\mathcal{P}(\lambda)$, as we have in Fig. 31.3. The derivative formulation of vectors in the previous equation allows us to define a tangent vector to this curve via
$$\boldsymbol{v}=\frac{\mathrm{d}\mathcal{P}(\lambda)}{\mathrm{d}\lambda}.$$
An interpretation of this expression, due to Élie Cartan, is that the vector represents the movement of the point $\mathcal{P}$, with the derivative evaluating the difference between the point at the tip of the tangent vector and the point at the base of this vector. Of course, we don't only have access to the parameter $\lambda$. We often describe curves in the manner shown in Fig. 31.4 using a coordinate system like $x^{\alpha}$. Using these coordinates, the parametrized path is written as $x^{\alpha}(\lambda)$, and we use the chain rule to say that our vector is expressed as
$$\boldsymbol{v}=\frac{\mathrm{d}\mathcal{P}}{\mathrm{d}\lambda}=\frac{\mathrm{d}x^{\alpha}}{\mathrm{d}\lambda}\frac{\partial\mathcal{P}}{\partial x^{\alpha}}.$$
The next step is to realize that, instead of interpreting a vector as the movement of the point $\mathcal{P}$, we could strip off the point to make a more general vector operator along the curve
$$\boldsymbol{v}[\ ]=\frac{\mathrm{d}}{\mathrm{d}\lambda}[\ ]=\frac{\mathrm{d}x^{\alpha}}{\mathrm{d}\lambda}\frac{\partial}{\partial x^{\alpha}}[\ ],$$
where the brackets in $\boldsymbol{v}[\ ]$ indicate that we need to provide a point on the curve as an input to the vector operator. In other words, the tangent vector $\boldsymbol{v}$ at a point is identified with a derivative $\mathrm{d}/\mathrm{d}\lambda$ at that point.${}^{4}$ Therefore, we don't need to view the vector as an arrow representing the movement of a point, but rather as an object that is attached to a particular point on the curve. The vector will vary as you evaluate it at different points along the curve.
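To see the identification $\boldsymbol{v}=\mathrm{d}/\mathrm{d}\lambda$ in action, one can check the chain rule numerically: the components $\mathrm{d}x^{\alpha}/\mathrm{d}\lambda$ contracted with the partial derivatives $\partial/\partial x^{\alpha}$ reproduce the total derivative along the curve. A sketch with an illustrative helix (my own choice of curve and function, not from the text):

```python
import numpy as np

def x_of_lam(lam):
    """A curve P(lambda) given in coordinates: a helix."""
    return np.array([np.cos(lam), np.sin(lam), lam])

def f(p):
    """A scalar function on the space."""
    x, y, z = p
    return x * y + z**2

lam0, eps = 0.7, 1e-6

# Components of the tangent vector: v^alpha = dx^alpha / dlambda.
v = (x_of_lam(lam0 + eps) - x_of_lam(lam0)) / eps

# Numerical partial derivatives df/dx^alpha, playing the role of the
# basis vectors e_alpha = d/dx^alpha acting on f.
p0 = x_of_lam(lam0)
grad_f = np.array([(f(p0 + eps * e) - f(p0)) / eps for e in np.eye(3)])

# Chain rule: v[f] = (dx^alpha/dlambda)(df/dx^alpha) = df/dlambda.
v_f = v @ grad_f
df_dlam = (f(x_of_lam(lam0 + eps)) - f(x_of_lam(lam0))) / eps

print(abs(v_f - df_dlam) < 1e-3)
```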
Finally, we return to our old notion of a vector, written as $\boldsymbol{v}=v^{\alpha}\boldsymbol{e}_{\alpha}$. Comparison with our new formulation of a vector in eqn 31.8 allows us to identify components and basis vectors as follows:
$$v^{\alpha}=\frac{\mathrm{d}x^{\alpha}}{\mathrm{d}\lambda},\qquad \boldsymbol{e}_{\alpha}=\frac{\partial}{\partial x^{\alpha}},$$
where we note that $\boldsymbol{e}_{\alpha}$ could be written as $\boldsymbol{e}_{\alpha}[\ ]$, since this is the part into which we input points on the curve.${}^{5}$ Mixing this notation allows us to write our vector as
$$\boldsymbol{v}=v^{\alpha}\boldsymbol{e}_{\alpha}=\frac{\mathrm{d}x^{\alpha}}{\mathrm{d}\lambda}\frac{\partial}{\partial x^{\alpha}}=\partial_{\boldsymbol{v}},$$
where we've introduced the directional derivative operator del_(v)\partial_{v} in the last step.
We therefore have a new definition of a vector as identified with a directional derivative operator. This is sloganized as: every vector can be identified with a derivative. We summarize our progress in the next box.

A vector $\boldsymbol{v}$ that is tangent to a curve $\mathcal{P}(\lambda)$ can be written as a derivative operator
$$\boldsymbol{v}=\frac{\mathrm{d}}{\mathrm{d}\lambda},$$
where $\lambda$ parametrizes the curve. The vector's components are $v^{\alpha}=\frac{\mathrm{d}x^{\alpha}}{\mathrm{d}\lambda}$ and the basis vectors are $\boldsymbol{e}_{\alpha}=\frac{\partial}{\partial x^{\alpha}}$.
The derivative in the definition of the vector could continue to be fed points like $\mathcal{P}$ on the curve, as it was in eqn 31.6. However, its real value is that we can input functions into this vector operator. We shall write the action of a vector field on a function${}^{6}$ $f(x)$ as
$$\boldsymbol{v}[f]=\frac{\mathrm{d}f}{\mathrm{d}\lambda}=v^{\alpha}\frac{\partial f}{\partial x^{\alpha}}.$$
This object evaluates the change in the function $f(x)$ along the direction of the vector $\boldsymbol{v}$, that is, the directional derivative.${}^{7}$ So to recap:
Every vector is equivalent to a differential operator that inputs a function and outputs the directional derivative with respect to the vector. This idea that every vector is a derivative links the concepts of geometry and analysis, as shown in Fig. 31.5. This helps explain the power of these techniques in the study of general relativity.
We can show that tangent vectors, defined via derivatives evaluated at a point $\mathcal{P}$, form a vector space. This relies on the notion that many curves can pass through a given point $\mathcal{P}$, allowing us to compare the different tangent vectors to each curve evaluated at $\mathcal{P}$ (Fig. 31.6).
Example 31.3
A vector $\boldsymbol{u}$, tangent to the curve parametrized by $\mu$, is written in coordinate space as
$$\boldsymbol{u}=\frac{\mathrm{d}}{\mathrm{d}\mu}=\frac{\mathrm{d}x^{\alpha}}{\mathrm{d}\mu}\frac{\partial}{\partial x^{\alpha}}.$$
Consider adding two different vectors at the same point $\mathcal{P}$ (Fig. 31.6). In the derivative language, these vectors are tangent to two different curves at $\mathcal{P}$: a curve $x(\lambda)$ and a curve $x(\mu)$. We write a linear combination, in terms of constants $a$ and $b$, as
$$a\frac{\mathrm{d}}{\mathrm{d}\lambda}+b\frac{\mathrm{d}}{\mathrm{d}\mu}=a\frac{\mathrm{d}x^{\alpha}}{\mathrm{d}\lambda}\frac{\partial}{\partial x^{\alpha}}+b\frac{\mathrm{d}x^{\alpha}}{\mathrm{d}\mu}\frac{\partial}{\partial x^{\alpha}}=\left(a\frac{\mathrm{d}x^{\alpha}}{\mathrm{d}\lambda}+b\frac{\mathrm{d}x^{\alpha}}{\mathrm{d}\mu}\right)\frac{\partial}{\partial x^{\alpha}},$$
${}^{5}$ In other words, the components tell us how the coordinates $x^{\alpha}$ change with the curve's parameter $\lambda$; the basis vectors represent the rate of change with respect to our choice of coordinates. ${}^{6}$ We continue to use square brackets for this, and round brackets for the vector's slot that inputs a 1-form. ${}^{7}$ It might sometimes help to continue to think of this as the value of the function at the tip of the vector, minus the value at the base of the vector.
Fig. 31.5 Links between geometry and analysis.
Fig. 31.6 Two of the many curves passing through point $\mathcal{P}$, parametrized by $\lambda$ and $\mu$, respectively. Their tangents at $\mathcal{P}$ are shown. ${}^{8}$ Roughly speaking, a vector space is characterized by linear combinations of vectors and has the property of closure (so that any linear combination of vectors belonging to the vector space can only produce a vector which is itself a member of the vector space). The defining features of a vector space include the existence of a zero vector and the existence of an inverse of any non-zero vector, and also properties such as associativity, distributivity and commutativity. There must be a basis of linearly independent vectors that spans the space, the number of which is equal to the number of dimensions of the space. ${}^{9}$ We write this $\mathrm{d}/\mathrm{d}\lambda|_{\mathcal{P}}$ to remove any ambiguity about which point we mean. Note that we don't write this $\mathrm{d}\mathcal{P}/\mathrm{d}\lambda$, since we no longer need to input the point $\mathcal{P}$ into the derivative and, generally, we won't. The vector will most often operate on a function to deliver the function's directional derivative along the vector. ${}^{10}$ These are important in physics. For example, we shall see in Chapter 39 how the congruence represents the streamlines of a fluid.
Fig. 31.7 A congruence of curves fills the space but the curves never intersect. Their tangent vectors form a vector field. ${}^{11}$ In the following chapters, we will see that there is another derivative that can also be used to compare vector fields at different points: this is the so-called Lie derivative.
where we've expanded both vectors in terms of our choice of coordinates $x^{\alpha}$ in the second step, so that they're referred to the same set of basis vectors $\boldsymbol{e}_{\alpha}=\partial/\partial x^{\alpha}$. The previous equation defines a new vector $\boldsymbol{w}$, with components $\left(a\frac{\mathrm{d}x^{\alpha}}{\mathrm{d}\lambda}+b\frac{\mathrm{d}x^{\alpha}}{\mathrm{d}\mu}\right)$. This new vector must be tangent to some other curve through $\mathcal{P}$ with, say, a parameter $\phi$, so we write
$$\boldsymbol{w}=\frac{\mathrm{d}}{\mathrm{d}\phi}=\left(a\frac{\mathrm{d}x^{\alpha}}{\mathrm{d}\lambda}+b\frac{\mathrm{d}x^{\alpha}}{\mathrm{d}\mu}\right)\frac{\partial}{\partial x^{\alpha}},$$
or $\boldsymbol{w}=a\boldsymbol{v}+b\boldsymbol{u}$. This means that the directional derivatives do indeed form a vector space${}^{8}$ at $\mathcal{P}$. Why? It is because linear combinations of a set of vectors ($\mathrm{d}/\mathrm{d}\lambda$ and $\mathrm{d}/\mathrm{d}\mu$ here) at $\mathcal{P}$ can express the directional derivatives of other curves at this point.
To summarize, our new formulation of vectors as derivative operators has been based on evaluating tangent vectors along a curve at a point such as $\mathcal{P}$. The vector $\boldsymbol{v}=\mathrm{d}/\mathrm{d}\lambda|_{\mathcal{P}}$ is tangent to the curve parametrized by $\lambda$ at point${}^{9}$ $\mathcal{P}$. Many other curves pass through point $\mathcal{P}$ and we have seen how the vectors defined at this point make a vector space. We can express an arbitrary vector at $\mathcal{P}$ as a superposition of other vectors at $\mathcal{P}$.
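The vector-space property is easy to check numerically: acting with $a\boldsymbol{v}+b\boldsymbol{u}$ on a function gives the same number as combining the two directional derivatives separately. A sketch (the function, point, components and constants are all illustrative choices of mine):

```python
import numpy as np

def f(p):
    """A test scalar function on a two-dimensional space."""
    x, y = p
    return np.sin(x) * y

def apply_vector(vec, p, eps=1e-6):
    """Act with a vector-as-derivative on f at point p: vec[f] = vec . grad f."""
    grad = np.array([(f(p + eps * e) - f(p)) / eps for e in np.eye(2)])
    return vec @ grad

P = np.array([0.5, 1.5])       # the common point of the two curves
v = np.array([1.0, 2.0])       # tangent to the lambda-curve at P
u = np.array([-0.5, 3.0])      # tangent to the mu-curve at P
a, b = 2.0, -1.0

w = a * v + b * u              # components of the combined vector

# Linearity: w[f] = a v[f] + b u[f], so w is again a directional
# derivative at P -- tangent vectors at P close under linear combination.
lhs = apply_vector(w, P)
rhs = a * apply_vector(v, P) + b * apply_vector(u, P)
print(abs(lhs - rhs) < 1e-8)
```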
Now let's imagine there being a large family of curves that fill the space, but don't intersect, as shown in Fig. 31.7. Each curve is parametrized by parameter $\lambda$. (However, a value such as $\lambda=2$ should be expected to label points different distances along each curve.) Such a set of space-filling curves that never intersect each other is called a congruence.${}^{10}$ The existence of a congruence means that at any point in space there is a curve with a unique derivative. Since derivatives are equivalent to tangent vectors, this gives us the notion of a vector field: at any point in space we can identify a unique vector. At point $\mathcal{Q}$, for example, we call that vector $\mathrm{d}/\mathrm{d}\lambda|_{\mathcal{Q}}$. It is the tangent to the curve from the congruence that passes through point $\mathcal{Q}$. Put the other way around, we can pick a point and the vector field will input this point and output a vector. The vector-field machine works out what the vector is by computing the tangent to the curve found in the congruence at that particular point.
In this formulation, it only makes sense to compare different vectors defined at a single point, such as Q\mathcal{Q}. There is no obvious way to compare vectors defined at another point P\mathcal{P} with those at Q\mathcal{Q}. In order to do this, we would need a way of moving vectors around, such that they can be compared at a single point. The method of moving vectors would require the extra information on how vectors change as they move throughout the space, in order to make the comparison a meaningful one. In the previous chapters, we saw that this requires the notion of the connection that led to the covariant derivative. The connection allowed us to formulate a notion of parallelism: how to move a vector through space such that the only changes would be due to how that space varied. ^(11){ }^{11} For now, we will continue to compare objects defined at only a single point.
31.3 Linear slot machines again
Next, we shall expand the menu of objects to which we have access by considering 1-forms. These are members of a family of objects called 'differential forms', or simply 'forms', that will occupy us for the remainder of this part of the book. Forms were proposed (or discovered) by Élie Cartan in around 1900. While vectors have become an essential tool whose properties are taught to all undergraduate physicists, forms do not currently share this status. This is arguably a mistake, since they are no more complicated than vectors, and allow access to a far simpler and clearer path to understanding geometry than is often presented.
We have, from our progress in this chapter, the notion of vector fields as derivatives that act on functions at a particular point in space. Quite separately from this, each vector has another possible input: vectors have a slot $\boldsymbol{v}(\ )$ that accepts a 1-form and returns a scalar. Calling a typical 1-form $\tilde{\boldsymbol{\sigma}}$, we have $\boldsymbol{v}(\tilde{\boldsymbol{\sigma}})\equiv\langle\tilde{\boldsymbol{\sigma}},\boldsymbol{v}\rangle=$ (number).${}^{12}$
Vectors, 1-forms and numbers live, in a sense, in different spaces. We mentioned how vectors live in a tangent space and, by the same token, 1-forms live in a dual space. In addition, scalar numbers live on the real line $\mathbb{R}^1$, giving the situation shown in Fig. 31.8. The important thing about the inner product summarized in the last equation is that it allows us to map between these spaces.
A 1-form maps a vector onto a number; a vector maps a 1-form onto a number.
This mapping onto a number (i.e. a point on the real line) is what we mean when we say that a vector is dual to a 1-form. An important property of the mapping/inner product operation is linearity, which is to say
$$\langle(a\tilde{\boldsymbol{\sigma}}+b\tilde{\boldsymbol{\rho}}),\boldsymbol{v}\rangle=a\langle\tilde{\boldsymbol{\sigma}},\boldsymbol{v}\rangle+b\langle\tilde{\boldsymbol{\rho}},\boldsymbol{v}\rangle,$$
and similarly $\langle\tilde{\boldsymbol{\sigma}},(n\boldsymbol{v}+m\boldsymbol{u})\rangle=n\langle\tilde{\boldsymbol{\sigma}},\boldsymbol{v}\rangle+m\langle\tilde{\boldsymbol{\sigma}},\boldsymbol{u}\rangle$.
We saw in Chapter 4 that while a vector can be thought of as a directed arrow, a 1-form can be thought of as an infinite set of equally spaced surfaces (Fig. 31.9). In this picture, the inner product of a vector and a 1-form outputs a number that corresponds to the number of surfaces that the vector pierces.
Example 31.4
Consider a 1-form $\tilde{\boldsymbol{A}}$. Working in Euclidean space, we find the number of surfaces pierced by a unit vector in the $x$-direction is $\langle\tilde{\boldsymbol{A}},\boldsymbol{e}_1\rangle=A_1$. This implies that the spacing of the surfaces along this direction is $1/A_1$. Multiplying the form by some factor $F$ causes the density of surfaces to be increased by the factor $F$.
Just as a vector can be written in terms of components and basis vectors, a 1-form $\tilde{\boldsymbol{\sigma}}$ is written in terms of components $\sigma_{\alpha}$ and basis 1-forms $\boldsymbol{\omega}^{\alpha}$ as $\tilde{\boldsymbol{\sigma}}=\sigma_{\alpha}\boldsymbol{\omega}^{\alpha}$. To allow the mapping between vectors, 1-forms and ${}^{12}$ We shall maintain the symmetry of the operation, so that the 1-form has a slot in which we can insert a vector: $\tilde{\boldsymbol{\sigma}}(\boldsymbol{v})=$ (number), where, in all cases we shall examine, $\boldsymbol{v}(\tilde{\boldsymbol{\sigma}})=\tilde{\boldsymbol{\sigma}}(\boldsymbol{v})$. This is just how we define the inner product between a vector and a 1-form. It is sometimes denoted by a dot and, when the dot would be confusing (as it usually denotes the scalar product of two vectors), more often by angle brackets: $\tilde{\boldsymbol{\sigma}}\cdot\boldsymbol{v}\equiv\langle\tilde{\boldsymbol{\sigma}},\boldsymbol{v}\rangle$. In summary, $\boldsymbol{v}(\tilde{\boldsymbol{\sigma}})=\tilde{\boldsymbol{\sigma}}(\boldsymbol{v})=\langle\tilde{\boldsymbol{\sigma}},\boldsymbol{v}\rangle=$ (number).
Fig. 31.8 Vectors live in a tangent space, 1-forms in a dual space, and numbers along the real line $\mathbb{R}^1$. A vector maps a 1-form onto a number; a 1-form maps a vector onto a number.
Fig. 31.9 The 1-form $\tilde{\boldsymbol{A}}$ represented as a set of repeating surfaces. The inner product $\langle\tilde{\boldsymbol{A}},\boldsymbol{X}\rangle$ tells us how many surfaces are pierced by the vector $\boldsymbol{X}$. ${}^{13}$ Recall that the differential $\mathrm{d}f$ of a function $f(x,y)$ is written as
$$\boldsymbol{d}f=\frac{\partial f}{\partial x}\boldsymbol{d}x+\frac{\partial f}{\partial y}\boldsymbol{d}y.$$
We use a bold $\boldsymbol{d}$ here to denote the differential operation. The reason for this will be explained in a few chapters' time. ${}^{14}$ As a result of this manipulation, we are able to effectively retire the vector's square-bracket slot and rely on the combination of the vector and 1-form to make numbers via tensor operations.
numbers, we define an inner product between basis 1-forms and basis vectors as
$$\langle\boldsymbol{\omega}^{\alpha},\boldsymbol{e}_{\beta}\rangle=\delta^{\alpha}{}_{\beta}.$$
In the new philosophy of having a vector correspond to a derivative, we have basis vectors $\boldsymbol{e}_{\mu}=\frac{\partial}{\partial x^{\mu}}$, where $x^{\mu}$ are a set of coordinates. We might also ask ourselves whether the 1-form corresponds to another operation. Of course it does: it is the operation that is dual to the operation of taking a derivative. Specifically, each basis 1-form corresponds to a differential${}^{13}$
$$\boldsymbol{\omega}^{\alpha}=\boldsymbol{d}x^{\alpha}.$$
The identification of the 1-form as a differential allows us to reconcile the two sets of slots we have given the vector $\boldsymbol{u}$: square brackets that accept a function and round ones that accept a 1-form. To operate on a function $f(x)$ with the vector $\boldsymbol{u}[\ ]$, making $\boldsymbol{u}[f]$, we can equivalently take an inner product between the vector $\boldsymbol{u}$ and the 1-form $\boldsymbol{d}f$, written as $\boldsymbol{u}(\boldsymbol{d}f)$ or, more helpfully, $\langle\boldsymbol{d}f,\boldsymbol{u}\rangle$. This is to say that we write${}^{14}$
$$\boldsymbol{u}[f]=\langle\boldsymbol{d}f,\boldsymbol{u}\rangle.$$
We conclude that the inner product gives the same directional derivative we had before (telling us how the function changes along the vector $\boldsymbol{u}$) and so, in terms of the directional derivative operator, we write
$$\boldsymbol{u}[f]=\langle\boldsymbol{d}f,\boldsymbol{u}\rangle=\partial_{\boldsymbol{u}}f=u^{\alpha}\frac{\partial f}{\partial x^{\alpha}}.$$
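This equality can be verified symbolically: the components of $\boldsymbol{d}f$ in the coordinate basis are just the partial derivatives of $f$, and contracting them with the components of $\boldsymbol{u}$ gives the directional derivative. A sketch (the function $f=x^2 y$ and the components of $\boldsymbol{u}$ are my own illustrative choices):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 * y                                 # a sample scalar function

# Components of the 1-form df in the coordinate basis dx^alpha.
df = [sp.diff(f, c) for c in (x, y)]         # (2*x*y, x**2)

# Components u^alpha of a vector in the basis d/dx^alpha.
u = [3, -1]

# The inner product <df, u> = u^alpha (df/dx^alpha) ...
inner = sum(ui * dfi for ui, dfi in zip(u, df))

# ... is the directional derivative of f along u: 3*(2xy) - 1*(x^2).
print(sp.expand(inner))
```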
Finally, we note that we have been dealing with vectors and 1-forms defined at the same point in space $\mathcal{P}$. Just as we define a vector field, we can also define a 1-form field. That is to say, at every point in space we have access to a unique 1-form. This can be combined with the vector we obtain at this point from a vector field to make a number. The numbers evaluated at each point in space themselves form a scalar field.
31.4 Tensors again
Tensors can be regarded as generalized slot machines.${}^{15}$ They are machines into which we insert combinations of vectors and 1-forms in order to return numbers. Generally, a valence $(n,m)$ tensor has slots for the insertion of $n$ 1-forms and $m$ vectors, in order to return a number. We have the defining rule for an $(n,m)$ tensor that
\begin{equation*}
\boldsymbol{T}\left(\tilde{\boldsymbol{\sigma}}^{1}, \ldots, \tilde{\boldsymbol{\sigma}}^{n}, \boldsymbol{v}^{1}, \ldots, \boldsymbol{v}^{m}\right)=(\text{number}). \tag{31.25}
\end{equation*}
Tensors can be built using a tensor product. The tensor product${}^{16}$ $\otimes$ of two vectors gives us a $(2,0)$ tensor
$$\boldsymbol{T}=\boldsymbol{u}\otimes\boldsymbol{v}.$$
This object has two slots into which 1-forms can be inserted. To find the components of the tensor (which are a set of numbers), insert basis 1-forms into each of the slots.
Example 31.5
How do we extract the components of the $(2,0)$ tensor $\boldsymbol{T}=\boldsymbol{u}\otimes\boldsymbol{v}$? We just insert the basis 1-forms into the slots. Here's how it goes:
$$T^{\alpha\beta}=\boldsymbol{T}(\boldsymbol{\omega}^{\alpha},\boldsymbol{\omega}^{\beta})=\boldsymbol{u}(\boldsymbol{\omega}^{\alpha})\,\boldsymbol{v}(\boldsymbol{\omega}^{\beta})=u^{\alpha}v^{\beta}.$$
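In components this is just an outer product, which makes it easy to check numerically (the component values below are illustrative):

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])   # components u^alpha
v = np.array([4.0, 5.0, 6.0])   # components v^beta

# Inserting the basis 1-forms omega^alpha, omega^beta into the two slots of
# T = u (x) v picks out one component of each vector, so T^{ab} = u^a v^b.
T = np.outer(u, v)

print(T[0, 2])   # u^0 * v^2 = 1.0 * 6.0 = 6.0
```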
This is sometimes called $\boldsymbol{d}s^2$. With this, the metric becomes a two-slot $(0,2)$ object called $\boldsymbol{g}(\ ,\ )$, into which we insert vectors to return a number.
An arbitrary $(n,m)$ tensor can be built from $n$ vectors and $m$ 1-forms:
$$\boldsymbol{T}=\boldsymbol{u}^{1}\otimes\cdots\otimes\boldsymbol{u}^{n}\otimes\tilde{\boldsymbol{\sigma}}^{1}\otimes\cdots\otimes\tilde{\boldsymbol{\sigma}}^{m}.$$
${}^{15}$ Tensor calculus was developed mainly by Gregorio Ricci-Curbastro (1853-1925). His most famous work on the calculus of tensors was co-authored with his former student, Tullio Levi-Civita (1873-1941), and signed Gregorio Ricci. (Curiously, all of Ricci's other works feature his full name, Ricci-Curbastro.) This work, Méthodes de calcul différentiel absolu et leurs applications, from 1901, greatly simplified the presentation of Riemannian geometry and gave us the version of geometry that Einstein used as the basis for his general relativity. ${}^{16}$ Recall that $\otimes$ simply tells us to retain the order of the slots. ${}^{17}$ The tensor product sign tells us to maintain the order of the vector parts and, separately, the 1-form parts. However, it doesn't matter if you list the basis vectors first or the basis 1-forms first, as long as you know which is which. ${}^{18}$ Our presentation here follows Misner, Thorne and Wheeler, which can be consulted for further details.
We find the tensor's components by inserting $n$ basis 1-forms and $m$ basis vectors:
$$T^{\alpha\ldots\kappa}{}_{\mu\ldots\sigma}=\boldsymbol{T}(\boldsymbol{\omega}^{\alpha},\ldots,\boldsymbol{\omega}^{\kappa},\boldsymbol{e}_{\mu},\ldots,\boldsymbol{e}_{\sigma}).$$
We will frequently encounter tensors built using tensor products from a selection of vectors and 1-forms. One useful rule is that, although you can't alter the order of the vectors amongst themselves, or of the 1-forms amongst themselves, the 1-form parts and vector parts commute.${}^{17}$
It should be no surprise that, in the same way that we defined vector fields and 1-form fields, we can also define tensor fields. That is, at each point in space, we have a unique tensor that can be combined with vectors and 1-forms found at that same point to make a number.
31.5 Examples of tensor operations
Once we have our tensors, we can apply a range of tools to manipulate them. Some of the common operations are described in the example below.${}^{18}$
Example 31.7
Consider, for example, a tensor $\boldsymbol{P}$ of rank $(0,3)$, which we can write as
$$\boldsymbol{P}=P_{\mu\nu\sigma}\,\boldsymbol{\omega}^{\mu}\otimes\boldsymbol{\omega}^{\nu}\otimes\boldsymbol{\omega}^{\sigma}.$$
Here is a list of things one can do with a tensor.
Take the gradient. The gradient operation $\boldsymbol{\nabla}$ adds an extra $(0,1)$ slot to the tensor. Therefore, $\boldsymbol{\nabla}\boldsymbol{P}$ has four slots that accept vectors. We interpret this as meaning
\begin{equation*}
\boldsymbol{\nabla}\boldsymbol{P}(\boldsymbol{u}, \boldsymbol{v}, \boldsymbol{w}, \boldsymbol{z}) \approx\binom{\boldsymbol{P}(\boldsymbol{u}, \boldsymbol{v}, \boldsymbol{w}) \text{ at tip of } \boldsymbol{z}}{-\boldsymbol{P}(\boldsymbol{u}, \boldsymbol{v}, \boldsymbol{w}) \text{ at base of } \boldsymbol{z}}. \tag{31.36}
\end{equation*}
In flat space, we write $\boldsymbol{\nabla}\boldsymbol{P}(\boldsymbol{u},\boldsymbol{v},\boldsymbol{w},\boldsymbol{z})$ in component form as
$$\boldsymbol{\nabla}\boldsymbol{P}(\boldsymbol{u},\boldsymbol{v},\boldsymbol{w},\boldsymbol{z})=\frac{\partial P_{\mu\nu\sigma}}{\partial x^{\lambda}}\,u^{\mu}v^{\nu}w^{\sigma}z^{\lambda}.$$
Contraction reduces the rank of a tensor by two. We do this by inserting a basis vector and a basis 1-form and summing over corresponding components. Let's do this to the $(1,3)$ tensor $\boldsymbol{S}$, defined as
The summing over a common up and down index is often called 'contracting an index'.
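Contracting an index is a plain sum in components, conveniently expressed with `numpy.einsum`'s repeated-index notation. A sketch with random components standing in for a $(1,3)$ tensor (the choice of which down index to contract is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Random components S^a_{bcd} of a (1,3) tensor in some basis.
S = rng.normal(size=(4, 4, 4, 4))

# Contract the up index with the first down index: C_{cd} = S^a_{acd}.
# The repeated label 'a' in einsum performs the sum over the common
# up and down index.
C = np.einsum('aacd->cd', S)

# The same sum written out by hand:
C_by_hand = sum(S[a, a] for a in range(4))

print(C.shape, np.allclose(C, C_by_hand))
```

The rank drops by two, as stated: a $(1,3)$ tensor becomes a $(0,2)$ tensor.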
Divergence Contract the gradient with a slot on the tensor. Let's consider our $(1,3)$ tensor $\boldsymbol{S}$ again. We'll assume we're taking the divergence with respect to the first slot.
Transpose Swap two of a tensor's slots. What this does to a tensor depends on the properties of that tensor. A tensor is symmetric if it is unaffected by the transposition of two slots.${}^{19}$ If it is unaffected by the swap of any two slots, it is totally symmetric:
\begin{equation*}
P(u, v, w)=P(v, u, w)=P(w, u, v)=\ldots \tag{31.43}
\end{equation*}
A tensor is antisymmetric if the sign is reversed when two slots are swapped.${}^{20}$ It is totally antisymmetric if the sign is reversed when any two slots are swapped:
\begin{equation*}
P(u, v, w)=-P(v, u, w)=+P(w, u, v)=\ldots \tag{31.44}
\end{equation*}
Symmetrization In components, the symmetrization of a $(2,0)$ tensor $\boldsymbol{T}$ involves the following action on the components:
$$T^{(\mu\nu)}=\frac{1}{2!}\left(T^{\mu\nu}+T^{\nu\mu}\right).$$
A symmetric tensor then has the property $T^{(\mu\nu)}=T^{\mu\nu}$. An antisymmetric tensor has the property $A^{(\mu\nu)}=0$. Symmetrization therefore extracts the symmetric part of a tensor. For an $(n,m)$ tensor we have the rule
\begin{equation*}
T_{\mu \ldots \sigma}^{(\alpha \ldots \kappa)}=\frac{1}{n!}\binom{\text{Sum over all permutations}}{\text{of the } n \text{ indices } \alpha \ldots \kappa}. \tag{31.46}
\end{equation*}
Antisymmetrization In components, the antisymmetrization of a tensor involves the following action on the components:
$$T^{[\mu\nu]}=\frac{1}{2!}\left(T^{\mu\nu}-T^{\nu\mu}\right).$$
A symmetric tensor has the property $T^{[\mu\nu]}=0$. An antisymmetric tensor has the property $A^{[\mu\nu]}=A^{\mu\nu}$. Antisymmetrization therefore extracts the antisymmetric part of a tensor. For an $(n,m)$ tensor we have the rule for antisymmetrization that
\begin{equation*}
T_{\mu \ldots \sigma}^{[\alpha \ldots \kappa]}=\frac{1}{n!}\binom{\text{Alternating sum over all}}{\text{permutations of the } n \text{ indices } \alpha \ldots \kappa}. \tag{31.48}
\end{equation*}
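The alternating sum over permutations can be spelled out directly for three indices (the random components are illustrative, and the permutation-sign helper is my own):

```python
import numpy as np
from itertools import permutations

def sign(perm):
    """Sign of a permutation, from its number of inversions."""
    inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
              if perm[i] > perm[j])
    return -1 if inv % 2 else 1

rng = np.random.default_rng(2)
T = rng.normal(size=(3, 3, 3))   # components T^{abc} of a (3,0) tensor

# T^{[abc]} = (1/3!) x (alternating sum over permutations of the 3 indices).
T_anti = sum(sign(p) * np.transpose(T, p) for p in permutations(range(3))) / 6

# The result is totally antisymmetric: any swap of two indices flips the sign.
print(np.allclose(np.swapaxes(T_anti, 0, 1), -T_anti),
      np.allclose(np.swapaxes(T_anti, 1, 2), -T_anti))
```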
The wedge product is an antisymmetrized tensor product. Take vectors $\boldsymbol{u}$ and $\boldsymbol{v}$ and build the bivector $\boldsymbol{b}$ using the wedge product
$$\boldsymbol{b}=\boldsymbol{u}\wedge\boldsymbol{v}=\boldsymbol{u}\otimes\boldsymbol{v}-\boldsymbol{v}\otimes\boldsymbol{u}.$$
${}^{19}$ If a $(2,0)$ tensor is symmetric then the components obey $T^{\mu\nu}=T^{\nu\mu}$. ${}^{20}$ If a $(2,0)$ tensor is antisymmetric then the components obey $A^{\mu\nu}=-A^{\nu\mu}$.
We shall use these tricks in the chapters that follow.
We now have a set of vectors, 1-forms and tensors with which to describe geometry. They can be written in a coordinate-independent form (i.e. $\boldsymbol{v}$, $\tilde{\boldsymbol{\sigma}}$, $\boldsymbol{T}$) or in terms of a set of coordinates by invoking basis vectors and components. By combining the objects using the relevant slots, we can map between them and numbers. Each of these objects is useful in describing physical quantities commonly found in Nature.
In the next chapter, we will treat a special class of tensors that are created from a particular combination of 1-forms. These are called differential forms and are of particular importance owing to their use in describing curvature.
Chapter summary
Vectors can be identified with derivatives. The tangent vector at a point parametrized by $\lambda$ on a curve $\mathcal{P}(\lambda)$ is $\boldsymbol{v}=\mathrm{d}/\mathrm{d}\lambda$. A vector field allows us to find a vector at each point.
A vector can be combined with a 1-form $\tilde{\boldsymbol{\sigma}}$ to make a number. A 1-form field gives a 1-form at every point in space. It only makes sense to combine vectors and 1-forms defined at the same point.
Tensors generalize the notion of vectors and 1-forms. Tensors can be manipulated in a large number of ways to produce tensors of different valences.
Exercises
(31.1) The basis vectors and basis 1-forms for a coordinate frame are given by del//delx^(mu)\partial / \partial x^{\mu} and dx^(alpha)\boldsymbol{d} x^{\alpha}, respectively. Evaluate:
(a) (:dx^(1),del//delx^(1):)\left\langle\boldsymbol{d} x^{1}, \partial / \partial x^{1}\right\rangle,
(b) (:dx^(1),del//delx^(0):)\left\langle\boldsymbol{d} x^{1}, \partial / \partial x^{0}\right\rangle,
(c) del//delx^(alpha)*del//delx^(beta)\partial / \partial x^{\alpha} \cdot \partial / \partial x^{\beta},
(d) dx^(alpha)*dx^(beta)\boldsymbol{d} x^{\alpha} \cdot \boldsymbol{d} x^{\beta}.
(31.2) Find a 1 -form tilde(W)\tilde{\boldsymbol{W}} that acts on a displacement vec(X)\vec{X} to give the work done by a force vec(f)\vec{f} in Euclidean space.
(31.3) Expand (a) T^{(\alpha \beta \gamma)}{}_{\delta \epsilon \zeta}; (b) T^{\alpha \beta \gamma}{}_{[\delta \epsilon \zeta]}.
(31.4) Consider the contraction A^(mu nu)T_(mu nu)A^{\mu \nu} T_{\mu \nu}, where T_(mu nu)T_{\mu \nu} are the components of some general tensor.
(a) Show that if A^(mu nu)A^{\mu \nu} is symmetric, only the symmetric part of T_(mu nu)T_{\mu \nu} contributes to the contraction.
(b) Show that if A^{\mu \nu} is antisymmetric, only the antisymmetric part of T_{\mu \nu} contributes to the contraction.
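These two statements can be checked numerically (a sketch with illustrative random tensors, not part of the exercise's intended analytic proof):

```python
import numpy as np

rng = np.random.default_rng(1)
T = rng.standard_normal((4, 4))          # a general T_{mu nu}
T_sym  = 0.5 * (T + T.T)                 # its symmetric part
T_anti = 0.5 * (T - T.T)                 # its antisymmetric part

S = rng.standard_normal((4, 4)); S = S + S.T   # a symmetric A^{mu nu}
A = rng.standard_normal((4, 4)); A = A - A.T   # an antisymmetric A^{mu nu}

# (a) a symmetric A^{mu nu} only sees the symmetric part of T_{mu nu}
assert np.isclose(np.sum(S * T), np.sum(S * T_sym))
# (b) an antisymmetric A^{mu nu} only sees the antisymmetric part
assert np.isclose(np.sum(A * T), np.sum(A * T_anti))
```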
(31.5) (a) Expand F_{[\mu \nu ; \lambda]} if F_{\mu \nu} is an antisymmetric tensor.
(b) Show that
(31.7) Roger Penrose has invented a diagrammatic notation for tensor equations that provides an entertaining way of visualizing complicated expressions. Tensors are represented by shapes. The rules are that a valence (n, m) tensor has n lines emerging from its top and m lines from the bottom. The position on the page allows one to keep track of indices. A thick bar denotes antisymmetrization and a wiggly bar denotes symmetrization (although the numerical factors associated with these operations are not conventionally included, so can be added to the diagrams where needed).
(a) Using these rules and referring to Fig. 31.10, explain why diagram (i) represents the component expression A^(mu nu)_(alpha beta gamma)+2A^(mu nu)_(beta gamma alpha)A^{\mu \nu}{ }_{\alpha \beta \gamma}+2 A^{\mu \nu}{ }_{\beta \gamma \alpha}.
(b) What expression does diagram (ii) represent? What about (iii)?
(c) If covariant derivatives are represented by a large circle around the tensor, suggest which equations of electromagnetism are represented by diagrams (iv) and (v).
(d) Which expression involving the Riemann tensor is represented by diagram (vi)?
See Penrose (2004) for more details.
Fig. 31.10 The Penrose tensor diagrams (i)-(vi) referred to in Exercise 31.7.
Fig. 32.1 An example 2-form \tilde{\boldsymbol{\beta}}(\ ,\ ) made from \boldsymbol{\omega}^{x} \wedge \boldsymbol{\omega}^{y}. ^{1} There's an analogy here between the vectors and 1-forms that we have described, with their characteristic relationship \left\langle\boldsymbol{\omega}^{\mu}, \boldsymbol{e}_{\nu}\right\rangle=\delta^{\mu}{}_{\nu}, and the idea of lattice vectors \boldsymbol{a}_{i} and reciprocal lattice vectors \boldsymbol{A}_{j} in solids, which have the analogous flat-space relationship \boldsymbol{A}_{i} \cdot \boldsymbol{a}_{j}=2 \pi \delta_{i j}. We can use reciprocal lattice vectors to label sets of lattice planes, just as we label repeating planes using 1-forms. See the book by Pauli, Section 10. ^{2} As we saw in the last chapter, we can also use the wedge product on vectors like \boldsymbol{u} and \boldsymbol{v} to make bivectors \boldsymbol{u} \wedge \boldsymbol{v}=\boldsymbol{u} \otimes \boldsymbol{v}-\boldsymbol{v} \otimes \boldsymbol{u}. Bivectors are antisymmetric (2,0) objects with two slots, each of which takes a 1-form.
Differential forms
Good God, there's two of them!
Audience heckle (on seeing the comic Bernie Winters take the stage at the Glasgow Empire, after his brother Mike had been poorly received)
In the last chapter, we reacquainted ourselves with vectors and 1-forms. With certain qualifications, the former can be thought of as arrows; the latter as a set of regularly repeating surfaces. In this chapter, we start with the 1-form and use it to generate a family of objects known as differential forms or p-forms. These have several applications in general relativity: a particularly useful example is that curvature can be described very efficiently as an object known as a 2-form. We will see how to encode information about areas and volumes in forms, and this will lead us, in Chapter 38, to the profound idea that every integral can be reinterpreted as an integral over a form.
32.1 2-forms
1-forms are the simplest of a family of tensor objects called differential forms. This family of forms can be built from 1-forms. Recall the geometric interpretation of a 1 -form as an infinite number of equally spaced surfaces. ^(1){ }^{1} By combining the surfaces of 1 -forms to build more complicated structures, we can generate all of the other differential forms.
Let's look at an example. Working in Cartesian coordinates, the surfaces of the basis 1-form \boldsymbol{\omega}^{\mu} are perpendicular to the direction of the basis vector \boldsymbol{e}_{\mu}. By combining the surfaces \boldsymbol{\omega}^{x} and \boldsymbol{\omega}^{y} as shown in Fig. 32.1 we can create the structure of tubes shown in the figure. This structure corresponds to an antisymmetric (0,2) tensor \tilde{\boldsymbol{\beta}}(\ ,\ ), which is our first example of a 2-form. To achieve the combination of the surfaces symbolically, we use the wedge product to combine the 1-forms.^{2} The wedge product of two 1-forms, \tilde{\boldsymbol{\sigma}} and \tilde{\boldsymbol{\tau}}, is the antisymmetrized tensor product
The wedge-product operation does not generate an arbitrary (0,2)(0,2) tensor; rather, it is antisymmetric. ^(3){ }^{3} The antisymmetry is interpreted geometrically in terms of the tubes in Fig. 32.1 having a handedness. To visualize this, we could picture the 2 -form circulating in a particular direction in each tube as shown in Fig. 32.2.
Let's now consider a more general 2-form \tilde{\boldsymbol{B}}(\ ,\ )=\tilde{\boldsymbol{\sigma}} \wedge \tilde{\boldsymbol{\tau}}, built from two arbitrary 1-forms \tilde{\boldsymbol{\sigma}} and \tilde{\boldsymbol{\tau}}. Any 2-form can be written in terms of its components B_{\mu \nu} as
where \boldsymbol{\omega}^{\mu} are basis 1-forms. The antisymmetry of the 2-form means that B_{\mu \nu}=-B_{\nu \mu}. The factor 1/2 is a convention that we justify in the next example.^{4}
Example 32.1
The motivation for the 1//21 / 2 factor is seen by expanding the 2 -form
where, in the final step, we use antisymmetry to say B_{\alpha \beta}=-B_{\beta \alpha}.
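The role of the factor 1/2 can also be seen numerically. In this sketch (the component values are illustrative) we build the components of \sigma \wedge \tau and check that summing over ordered pairs \mu < \nu gives the same result as the full sum with the 1/2:

```python
import numpy as np

sigma = np.array([1.0, 2.0, 3.0])    # illustrative components sigma_mu
tau   = np.array([0.5, -1.0, 4.0])   # illustrative components tau_nu

# (sigma ^ tau)_{mu nu} = sigma_mu tau_nu - sigma_nu tau_mu
B = np.outer(sigma, tau) - np.outer(tau, sigma)

assert np.allclose(B, -B.T)          # antisymmetry: B_{mu nu} = -B_{nu mu}

# Contracting B with itself: (1/2) * full sum over mu, nu equals
# the sum restricted to ordered pairs mu < nu (no double counting)
full    = 0.5 * np.einsum('mn,mn->', B, B)
ordered = sum(B[m, n] * B[m, n] for m in range(3) for n in range(m + 1, 3))
assert np.isclose(full, ordered)
```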
Just as with the tensors from the last chapter, it is possible to use the slot-machine structure of forms to extract their components.
Example 32.2
We can extract the components of the 2 -form tilde(B)= tilde(sigma)^^ tilde(tau)\tilde{\boldsymbol{B}}=\tilde{\boldsymbol{\sigma}} \wedge \tilde{\boldsymbol{\tau}} in terms of the components of tilde(sigma)\tilde{\boldsymbol{\sigma}} and tilde(tau)\tilde{\boldsymbol{\tau}} by inserting basis vectors e_(mu)\boldsymbol{e}_{\mu} into the slots in the expression tilde(B)()= tilde(sigma)()^^ tilde(tau)()\tilde{\boldsymbol{B}}()=\tilde{\boldsymbol{\sigma}}() \wedge \tilde{\boldsymbol{\tau}}() to find
^(3){ }^{3} Recall from the previous chapter that a (0,2)(0,2) tensor is antisymmetric if tilde(B)(x,y)=-B(y,x)\tilde{B}(\boldsymbol{x}, \boldsymbol{y})=-\boldsymbol{B}(\boldsymbol{y}, \boldsymbol{x}), where x\boldsymbol{x} and y\boldsymbol{y} are vectors.
Fig. 32.2 The 1-forms \boldsymbol{\omega}^{x} and \boldsymbol{\omega}^{y} can be thought of as sets of parallel surfaces. Their combination gives the tubular structure described by \boldsymbol{\omega}^{x} \wedge \boldsymbol{\omega}^{y}, with a particular handedness, as depicted by the arrows in the tubes. ^{4} As shown in the example, the factor 1/2 only arises when we're summing over indices, as in eqn 32.5. If we choose to specify particular components, rather than sum over them, no explicit factor is needed. For example, if a 2-form in Euclidean 3-space has only one non-zero component, we write
B_{\mu \nu}=\gamma\left[\delta_{\mu}^{2} \delta_{\nu}^{3}-\delta_{\mu}^{3} \delta_{\nu}^{2}\right], so B_{23}=\gamma and B_{32}=-\gamma. ^{5} Recall from the last chapter that the spacing of the surfaces in the Cartesian \mu-direction is 1/\sigma_{\mu}.
Fig. 32.3 An arbitrary 2-form. ^(6){ }^{6} We define a 0 -form to be a function such as f(x)f(x). ^(7){ }^{7} For example in three-dimensional space
Using (i) B_{\mu \nu}=-B_{\nu \mu} and (ii) \boldsymbol{\omega}^{\mu} \wedge \boldsymbol{\omega}^{\nu}=-\boldsymbol{\omega}^{\nu} \wedge \boldsymbol{\omega}^{\mu}, we can simplify this to read \tilde{\boldsymbol{B}}=\frac{1}{2} B_{\mu \nu}\left(\boldsymbol{\omega}^{\mu} \wedge \boldsymbol{\omega}^{\nu}\right)=B_{12}\, \boldsymbol{\omega}^{1} \wedge \boldsymbol{\omega}^{2}+B_{13}\, \boldsymbol{\omega}^{1} \wedge \boldsymbol{\omega}^{3}+B_{23}\, \boldsymbol{\omega}^{2} \wedge \boldsymbol{\omega}^{3}=B_{|\mu \nu|}\, \boldsymbol{\omega}^{\mu} \wedge \boldsymbol{\omega}^{\nu}.
Example 32.3
Let's try inserting a single vector u\boldsymbol{u} into the first slot of a 2 -form. (Remember that the tensor products ox\otimes are intended to maintain the order of the linear slots.) We find
The result is a 1 -form: a (0,1)(0,1) tensor object with a single available slot that accepts a vector.
An arbitrary 1-form \tilde{\boldsymbol{\sigma}}=\sigma_{\mu} \boldsymbol{\omega}^{\mu} is represented by a series of surfaces oriented according to, and with spacing determined by, its components.^{5} The wedge product of two such arbitrary 1-forms resembles the example shown in Fig. 32.3: intersecting surfaces giving a series of (parallelepiped) tubes. As in our simple example earlier, the tubes come with a sense of direction, since there is a difference between \tilde{\boldsymbol{\sigma}} \wedge \tilde{\boldsymbol{\tau}} and \tilde{\boldsymbol{\tau}} \wedge \tilde{\boldsymbol{\sigma}}. We think of the 2-form field as circulating in each of the cells.
32.2 p-forms
We've seen that the wedge product of two 1 -forms is a 2 -form. In fact, we can generalize this notion and say that, if tilde(mu)\tilde{\boldsymbol{\mu}} is a pp-form and tilde(nu)\tilde{\boldsymbol{\nu}} is a qq-form, then we can use the wedge product to make a (p+q)(p+q)-form tilde(mu)^^ tilde(nu)\tilde{\boldsymbol{\mu}} \wedge \tilde{\boldsymbol{\nu}}, with the property that
Using this idea, we can generate pp-forms by making more and more wedge products. ^(6){ }^{6}
A p-form tilde(alpha)\tilde{\boldsymbol{\alpha}} can be written as
Here, the vertical bars around the list of indices i_{n} mean that we fix i_{1}<i_{2}<\ldots<i_{p} and therefore don't need the 1/p! normalization factor in the sum.^{7}
Example 32.4
Let's make a 3 -form from the 1 -forms tilde(sigma), tilde(tau)\tilde{\boldsymbol{\sigma}}, \tilde{\boldsymbol{\tau}} and tilde(kappa)\tilde{\boldsymbol{\kappa}}. We have
Geometrically, we can build this object by stepping up from the equally spaced planes of the 1-form, via the tube-like structure of the 2 -form to the cellular structure of the 3 -form.
A simple example of a pp-form that we can build in nn-dimensional space involves multiplying a scalar function alpha_(|i_(1),i_(2)dotsi_(p)|)=f(x^(1),x^(2),dots,x^(p))\alpha_{\left|i_{1}, i_{2} \ldots i_{p}\right|}=f\left(x^{1}, x^{2}, \ldots, x^{p}\right) by wedge products of the basis 1 -forms or, equivalently, the differentials of coordinates to build a pp-form tensor tilde(beta)\tilde{\boldsymbol{\beta}}
All pp-forms are linear and can therefore be added to other pp-forms to make the arbitrary pp-form of your choosing.
Example 32.5
In three-dimensional space with basis 1 -forms omega^(1)=dx,omega^(2)=dy\boldsymbol{\omega}^{1}=\boldsymbol{d} x, \boldsymbol{\omega}^{2}=\boldsymbol{d} y and omega^(3)=dz\boldsymbol{\omega}^{3}=\boldsymbol{d} z, we can investigate some pp-form objects that we make from functions of the coordinates (x,y,z)(x, y, z) and the three basis forms:
0 -form quad tilde(alpha)=f(x,y,z)\quad \tilde{\boldsymbol{\alpha}}=f(x, y, z),
1-form quad tilde(beta)=A(x,y,z)dx+B(x,y,z)dy+C(x,y,z)dz\quad \tilde{\boldsymbol{\beta}}=A(x, y, z) \boldsymbol{d} x+B(x, y, z) \boldsymbol{d} y+C(x, y, z) \boldsymbol{d} z,
2-form \quad \tilde{\boldsymbol{\mu}}=D(x, y, z)\, \boldsymbol{d} x \wedge \boldsymbol{d} y+E(x, y, z)\, \boldsymbol{d} y \wedge \boldsymbol{d} z+F(x, y, z)\, \boldsymbol{d} z \wedge \boldsymbol{d} x,
3-form \quad \tilde{\boldsymbol{\nu}}=G(x, y, z)\, \boldsymbol{d} x \wedge \boldsymbol{d} y \wedge \boldsymbol{d} z.
Notice how having two identical 1-forms in any wedge product causes it to vanish (e.g. \boldsymbol{d} x \wedge \boldsymbol{d} y \wedge \boldsymbol{d} y=0). This constrains the possible p-forms to those above. It is not possible, for example, for a 4-form to exist in three spatial dimensions.
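The impossibility of a 4-form in three dimensions can be checked numerically: fully antisymmetrizing four indices that each run over only three values always annihilates the tensor, because some index value must repeat. In this sketch, `perm_sign` and `antisymmetrize` are hypothetical helper names (not from the text):

```python
import math
from itertools import permutations

import numpy as np

def perm_sign(p):
    """Sign (-1)^P of a permutation p of 0..k-1."""
    p = list(p)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign

def antisymmetrize(T):
    """Full antisymmetrization T_[i1...ik] over all k indices of T."""
    k = T.ndim
    out = np.zeros_like(T)
    for perm in permutations(range(k)):
        out = out + perm_sign(perm) * np.transpose(T, perm)
    return out / math.factorial(k)

rng = np.random.default_rng(2)
# Four antisymmetrized indices over three values: vanishes identically
T3 = rng.standard_normal((3, 3, 3, 3))
assert np.allclose(antisymmetrize(T3), 0.0)
# In four dimensions a 4-form survives
T4 = rng.standard_normal((4, 4, 4, 4))
assert not np.allclose(antisymmetrize(T4), 0.0)
```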
32.3 p-vectors
Consider vectors u\boldsymbol{u} and v\boldsymbol{v} in Euclidean 2-space. We can build a parallelogram in Fig. 32.4 from these vectors. Do this by starting at the origin and drawing u\boldsymbol{u} and then v\boldsymbol{v}, returning to the origin and drawing v\boldsymbol{v} and then u\boldsymbol{u}. The area of this parallelogram is given by the computation of the determinant
where epsi_(ij)\varepsilon_{i j} is the two-dimensional Levi-Civita symbol. ^(8){ }^{8} The signed area is antisymmetric in the components of u\boldsymbol{u} and v\boldsymbol{v}, which can be thought of as giving a handedness to the area. The area of the parallelogram is related to the bivector b\boldsymbol{b} defined as the antisymmetric (2,0)(2,0) tensor ^(9)^{9}
In fact, it is fairly easy to see that the area of the parallelogram is equal to the component b^{x y}, accessible by inserting basis 1-forms \boldsymbol{\omega}^{x} and \boldsymbol{\omega}^{y} into the slots of the bivector [i.e. \boldsymbol{b}\left(\boldsymbol{\omega}^{x}, \boldsymbol{\omega}^{y}\right)]. We will return to this in Chapter 37.
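As a small numerical sketch (the vectors here are arbitrary illustrative choices), the signed area from the determinant agrees with the single independent bivector component b^{xy}:

```python
import numpy as np

u = np.array([3.0, 1.0])  # illustrative vector u
v = np.array([1.0, 2.0])  # illustrative vector v

# Signed area via the determinant with u and v as columns
area = np.linalg.det(np.column_stack([u, v]))

# Bivector component b^{xy} = u^x v^y - u^y v^x
b_xy = u[0] * v[1] - u[1] * v[0]

assert np.isclose(area, b_xy)
assert np.isclose(area, 5.0)  # 3*2 - 1*1
```

Swapping u and v flips the sign of both quantities, which is the handedness of the area mentioned above.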
For now, we note that the existence of pp-vectors, formed from the wedge products of vectors with components
^(8){ }^{8} In two dimensions, epsi_(12)=1,epsi_(21)=\varepsilon_{12}=1, \varepsilon_{21}= -1 , and epsi_(11)=epsi_(22)=0\varepsilon_{11}=\varepsilon_{22}=0. In general, epsi_(i_(1)dotsi_(n))=(-1)^(P)\varepsilon_{i_{1} \ldots i_{n}}=(-1)^{P}, where PP is the parity of the permutation of 1,2,3dots n1,2,3 \ldots n in the indices (i.e. how many pairwise swaps need to be made to a string of numbers to work it back to the order 1dots n1 \ldots n ), and epsi_(i_(1)dotsi_(n))=0\varepsilon_{i_{1} \ldots i_{n}}=0 if any of the indices are repeated.
Fig. 32.4 The parallelogram formed by vectors u\boldsymbol{u} and v\boldsymbol{v}. ^(9){ }^{9} As for the case of forms, we can write the components of a bivector as
\begin{cases}+1 & \text{if }(i_{1} \ldots i_{p})\text{ is an even permutation of }(j_{1} \ldots j_{p}) \\ -1 & \text{if }(i_{1} \ldots i_{p})\text{ is an odd permutation of }(j_{1} \ldots j_{p}) \\ 0 & \text{if any two }i\text{'s are the same, any two }j\text{'s are the same, or the }i\text{'s and }j\text{'s are different integers.}\end{cases}
Fig. 32.5 The bivector \boldsymbol{b} and 2-form \tilde{\boldsymbol{F}} are used to form the inner product \langle\tilde{\boldsymbol{F}}, \boldsymbol{b}\rangle. This outputs a number equal to the number of tubes of \tilde{\boldsymbol{F}} contained within the parallelogram \boldsymbol{b}, which in this case is 4.
The existence of pp-vectors allows us to define a set of generalized inner products of pp-forms, such that pp-vectors and pp-forms can be mapped on to numbers. In order to make the inner product of a pp-form tilde(alpha)=\tilde{\boldsymbol{\alpha}}=alpha_(|i_(1)dotsi_(p)|)omega^(i_(1))^^dots^^omega^(i_(p))\alpha_{\left|i_{1} \ldots i_{p}\right|} \boldsymbol{\omega}^{i_{1}} \wedge \ldots \wedge \boldsymbol{\omega}^{i_{p}} and a pp-vector v=v^(|j_(1)dotsj_(p)|)e_(j_(1))^^dots^^e_(j_(p))\boldsymbol{v}=v^{\left|j_{1} \ldots j_{p}\right|} \boldsymbol{e}_{j_{1}} \wedge \ldots \wedge \boldsymbol{e}_{j_{p}}, we have the rule ^(10){ }^{10}
Example 32.6
Work in three spatial dimensions, with components arranged in the order (x^(1),x^(2),x^(3))=(x,y,z)\left(x^{1}, x^{2}, x^{3}\right)=(x, y, z). Combine a 2 -form tilde(F)\tilde{\boldsymbol{F}} and a bivector b\boldsymbol{b} to make a number thus
This has a pleasing resemblance to the dot product of vectors, or the inner product of a vector and a 1 -form. The geometrical interpretation of this is that the scalar outputted by the inner product represents the number of tubes of the 2 -form tilde(F)\tilde{\boldsymbol{F}} contained in the parallelogram representing the bivector b\boldsymbol{b} (Fig. 32.5).
It's also useful to note that if the bivector b\boldsymbol{b} is formed from vectors c\boldsymbol{c} and d\boldsymbol{d} via b=c^^d\boldsymbol{b}=\boldsymbol{c} \wedge \boldsymbol{d} then we also have that
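Both of these statements can be verified component-by-component in a short numerical sketch (the random components are illustrative; the ordered-pair sum implements the |...| bars):

```python
import numpy as np

rng = np.random.default_rng(3)
F = rng.standard_normal((3, 3)); F = F - F.T   # antisymmetric F_{ij} of a 2-form
c = rng.standard_normal(3)                     # illustrative vector c
d = rng.standard_normal(3)                     # illustrative vector d

b = np.outer(c, d) - np.outer(d, c)            # bivector b = c ^ d, components b^{ij}

# Inner product as a sum over ordered pairs i < j, as the bars instruct
ordered = sum(F[i, j] * b[i, j] for i in range(3) for j in range(i + 1, 3))

# Equivalent full contraction with 1/2 to undo the double counting
full = 0.5 * np.einsum('ij,ij->', F, b)
assert np.isclose(ordered, full)

# And it equals the 2-form evaluated on the two vectors: F(c, d) = F_{ij} c^i d^j
assert np.isclose(ordered, np.einsum('ij,i,j->', F, c, d))
```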
To add to our list of known tensor objects, we now have pp-forms and qq-vectors, which are examples of antisymmetric tensors. In the next chapter, our study of differential geometry starts in earnest, as we start to take derivatives.
Chapter summary
Differential forms can be built from 1-forms using the wedge product. A pp-form is an antisymmetric (0,p)(0, p) tensor.
2 -forms can be represented as a tube-like structure with a handedness.
pp-vectors can be built from vectors using the wedge product. A pp-vector and a pp-form can be combined to make a number.
Exercises
(32.1) Verify eqn 32.23 by computing components.
(32.2) Fermi transport: Consider a sphere, and vectors \boldsymbol{a}, \boldsymbol{b} and \boldsymbol{c} that are orthogonal to each other and to the velocity vector \boldsymbol{u} of the sphere (i.e. the vector field that is tangent to the sphere's world line). The three vectors determine the orientation of the sphere along its world line and can be used to measure the rotation of the sphere. We can write down equations of motion that tell us there is no rotation; the extent to which these equations are violated measures the rotation. The equations of motion, known as the Fermi transport equations,
are
where A=grad_(u)u\boldsymbol{A}=\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{u} is the acceleration.
(a) Write one of these equations in components.
(b) What happens for transport along a geodesic world line?
(c) Show that if a,b\boldsymbol{a}, \boldsymbol{b} and c\boldsymbol{c} are initially orthogonal, they stay orthogonal.
(d) Show that a vector transported by these equations remains orthogonal to the four-velocity.
(e) Compare Fermi transport with parallel transport.
Fig. 33.1 The action of the exterior derivative \boldsymbol{d} on the 1-form f(x)\, \boldsymbol{d} y is to convert it into (\partial f(x) / \partial x)\, \boldsymbol{d} x \wedge \boldsymbol{d} y.
Exterior and Lie derivatives
More ways of killing a cat than choking her with cream
Charles Kingsley (1819-1875)
Derivatives are undeniably important in physics. We would like a way of taking derivatives of tensor fields that does not rely on coordinates. We naturally reach the idea of a derivative if we think in terms of a tensor changing as we move around a space. More precisely, we work with the notion of a tensor field which, at each point in space, provides us with a tensor. In working out derivatives, we are faced with the problem that there is no obvious way to compare tensor fields evaluated at two different points in a space. One way around this is to use the metric to define a notion of parallelism. This requires a connection: a method of connecting different points, so that tensors can be moved in a parallel manner around the space. We examine this connection in more detail in Chapter 34. If we don't have this parallelism (or we don't have a metric) then how can we proceed? In this chapter, we examine two ways of dealing with more primitive notions of rates of change of tensors. The first type of derivative applies only to fields of forms and is called the exterior derivative. We shall later use this derivative to describe the curvature of spacetime. The second type of derivative applies to all tensors and relies on being able to carry the tensor field of interest across the space using a separate vector field. This is called a Lie derivative and is the subject of the second part of this chapter. The Lie derivative is especially useful in cosmology for describing changes as we travel along with the cosmological fluid. However, perhaps its most useful feature is in identifying conserved quantities via Killing vectors, which we describe at the end of the chapter.
33.1 Exterior calculus
The exterior derivative is a differential operation that can be applied to forms. In measuring a rate of change of a p-form, exterior differentiation converts the p-form into a (p+1)-form. This has a pictorial interpretation which is clearest for the action of the exterior derivative on a 1-form and is shown in Fig. 33.1. Recall that a 1-form can be thought of as a set of parallel surfaces in space, while a 2-form is a tubular structure. By measuring how fast a component of the 1-form changes along some direction, the exterior derivative builds us a 2-form by adding surfaces normal to this direction.
To use the exterior derivative we define an operator \boldsymbol{d} that acts on forms. The simplest 1-form is the differential \boldsymbol{d} f. This 1-form can be thought of as originating from the action of the exterior derivative operator \boldsymbol{d} on the 0-form f(x). In general, the operator \boldsymbol{d} can be thought of as a (0,1) object that creates the (p+1)-form \boldsymbol{d} \tilde{\boldsymbol{A}} from a p-form \tilde{\boldsymbol{A}}. It obeys the rules listed in the margin.
The rules feel rather austere. We get a clearer idea of what the operator does in practice if we consider its action on an arbitrary pp-form
As with all calculus, the only way to really see what's going on is to examine some examples.
Example 33.1
We work in (3+1)-dimensional spacetime with spherical coordinates x^(mu)=(t,chi,theta,phi)x^{\mu}=(t, \chi, \theta, \phi).
We act with d\boldsymbol{d} on a 1 -form tilde(A)=A_(alpha)(t,chi,theta,phi)dx^(alpha)\tilde{\boldsymbol{A}}=A_{\alpha}(t, \chi, \theta, \phi) \boldsymbol{d} x^{\alpha}, where A_(alpha)A_{\alpha} are functions of the coordinates. We will obtain
Example 0: Consider a 1-form \tilde{\boldsymbol{A}}=\boldsymbol{d} t. Acting with \boldsymbol{d} we obtain
{:(33.12)d tilde(A)=ddt=0:}\begin{equation*}
d \tilde{A}=d d t=0 \tag{33.12}
\end{equation*}
where we've used dd=0\boldsymbol{d} \boldsymbol{d}=0 (rule IV in the box).
Example 1: Consider a 1-form tilde(A)=a(t)d chi\tilde{\boldsymbol{A}}=a(t) \boldsymbol{d} \chi, where a(t)a(t) is a scalar function of tt only.
Acting with d\boldsymbol{d} we obtain
where we've used the fact that del a(t)//delx^(i)=0\partial a(t) / \partial x^{i}=0, for i!=ti \neq t.
Example 2: Consider a 1-form \tilde{\boldsymbol{A}}=a(t) \sin \theta\, \boldsymbol{d} \theta. Taking the exterior derivative we obtain
where we've used the fact that dx^(mu)^^dx^(mu)=0\boldsymbol{d} x^{\mu} \wedge \boldsymbol{d} x^{\mu}=0.
The exterior derivative operator \boldsymbol{d} has the following properties when acting on the p-form \tilde{\boldsymbol{\alpha}} and q-form \tilde{\boldsymbol{\beta}}: I: It is linear
Example 3: Consider a 1-form tilde(A)=a(t)sin chi sin theta d phi\tilde{\boldsymbol{A}}=a(t) \sin \chi \sin \theta \boldsymbol{d} \phi. This time we have the exterior derivative
\boldsymbol{d} \tilde{\boldsymbol{A}}=\frac{\partial a(t)}{\partial t} \sin \chi \sin \theta\, \boldsymbol{d} t \wedge \boldsymbol{d} \phi+a(t) \cos \chi \sin \theta\, \boldsymbol{d} \chi \wedge \boldsymbol{d} \phi+a(t) \sin \chi \cos \theta\, \boldsymbol{d} \theta \wedge \boldsymbol{d} \phi
All of the 1 -forms chosen in this example will be met again when we examine the metric of the expanding Universe.
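The component rule for the exterior derivative of a 1-form, (\boldsymbol{d}\tilde{\boldsymbol{A}})_{\mu\nu}=\partial_{\mu}A_{\nu}-\partial_{\nu}A_{\mu}, lets us check Example 3 symbolically. A sketch using SymPy (the coordinate ordering (t, \chi, \theta, \phi) follows the text):

```python
import sympy as sp

t, chi, theta, phi = sp.symbols('t chi theta phi')
coords = [t, chi, theta, phi]
a = sp.Function('a')(t)

# Components A_mu of the 1-form A = a(t) sin(chi) sin(theta) d phi
A = [0, 0, 0, a * sp.sin(chi) * sp.sin(theta)]

# (dA)_{mu nu} = d_mu A_nu - d_nu A_mu
dA = [[sp.diff(A[nu], coords[mu]) - sp.diff(A[mu], coords[nu])
       for nu in range(4)] for mu in range(4)]

# Compare with the three terms of Example 3
assert sp.simplify(dA[0][3] - sp.diff(a, t) * sp.sin(chi) * sp.sin(theta)) == 0
assert sp.simplify(dA[1][3] - a * sp.cos(chi) * sp.sin(theta)) == 0
assert sp.simplify(dA[2][3] - a * sp.sin(chi) * sp.cos(theta)) == 0
```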
The rule dd=0\boldsymbol{d} \boldsymbol{d}=0 puts constraints on the sorts of objects that can be accommodated in (3+1)-dimensional spacetime, as we shall see in the next example.
Example 33.2
We work in (3+1)-dimensional spacetime. Starting with a function ff, we can construct a 1 -form tilde(alpha)\tilde{\boldsymbol{\alpha}} using the exterior derivative
{:(33.16) tilde(alpha)=df:}\begin{equation*}
\tilde{\boldsymbol{\alpha}}=\boldsymbol{d} f \tag{33.16}
\end{equation*}
The 1 -form looks like a set of evenly spaced planes. If we try to obtain a 2 -form by acting on tilde(alpha)\tilde{\boldsymbol{\alpha}} with the exterior derivative we obtain
The 2-form looks like a set of tubes in space, with the 2 -form field circulating in each of the tubes. If we try to obtain a 3 -form from the 2 -form using the exterior derivative we obtain
{:(33.19)d tilde(F)=dd tilde(A)=0:}\begin{equation*}
d \tilde{\boldsymbol{F}}=d d \tilde{\boldsymbol{A}}=0 \tag{33.19}
\end{equation*}
Now start with a 2 -form tilde(nu)\tilde{\boldsymbol{\nu}} and construct a 3 -form by acting on tilde(nu)\tilde{\boldsymbol{\nu}} with d\boldsymbol{d}, so that we have
The 3 -form tilde(mu)\tilde{\mu} looks like a set of cubes with the 3 -form field circulating inside. If we try to obtain a 4 -form from the 3 -form using the exterior derivative we obtain
{:(33.21)d tilde(mu)=dd tilde(nu)=0:}\begin{equation*}
d \tilde{\mu}=d d \tilde{\nu}=0 \tag{33.21}
\end{equation*}
Continuing, we can start with a 3-form and make a 4-form using \boldsymbol{d}. A 5-form, however, cannot be accommodated in our four-dimensional space. ^{2} Marius Sophus Lie (1842-1899). His name is pronounced 'Lee' (as in Bruce Lee, to rhyme with ski). Continuous transformation groups are now called Lie groups in honour of his greatest work in mathematics, and are very widely used in quantum mechanics. Élie Cartan was a doctoral student of Lie.
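The rule \boldsymbol{d}\boldsymbol{d}=0 for 0-forms is just the symmetry of mixed partial derivatives: (\boldsymbol{d}\boldsymbol{d}f)_{\mu\nu}=\partial_{\mu}\partial_{\nu}f-\partial_{\nu}\partial_{\mu}f=0. A symbolic sketch (the function f is an arbitrary illustrative choice):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
coords = [x, y, z]
f = sp.exp(x) * sp.sin(y) * z**2   # an arbitrary smooth 0-form

# alpha = df has components alpha_mu = d_mu f, so
# (d alpha)_{mu nu} = d_mu d_nu f - d_nu d_mu f, which vanishes
for mu in range(3):
    for nu in range(3):
        ddf = (sp.diff(f, coords[mu], coords[nu])
               - sp.diff(f, coords[nu], coords[mu]))
        assert sp.simplify(ddf) == 0
```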
We now turn to the second sort of derivative for coordinate-free geometry: the Lie derivative. ^(2){ }^{2} This derivative is most simply applied using the notion of a commutator, so we first pause to explore this in the next section.
33.2 Commutators
Adopting the idea of a vector as an operator, we define the commutator [u,v][\boldsymbol{u}, \boldsymbol{v}] of two vector fields u=d//dlambda\boldsymbol{u}=\mathrm{d} / \mathrm{d} \lambda and v=d//dmu\boldsymbol{v}=\mathrm{d} / \mathrm{d} \mu as
We could write the right-hand side of this equation out as another vector field w^{\beta} \partial / \partial x^{\beta}, acting on the function f. The components are w^{\beta}=u^{\alpha} \frac{\partial v^{\beta}}{\partial x^{\alpha}}-v^{\alpha} \frac{\partial u^{\beta}}{\partial x^{\alpha}}, which will, in general, be a set of non-zero numbers. We conclude that the commutator generates another vector field, whose components do not, in general, vanish.
Another conclusion that can be drawn from the previous example, through the removal of the function f, is that the commutator, written in terms of coordinates, is
This expression, built from directional derivatives, gives rise to a simple pictorial view of the commutator. The first term is an instruction to evaluate the difference Delta_(2)\boldsymbol{\Delta}_{2} in the vector field v\boldsymbol{v} evaluated at the tip and base of the vector u\boldsymbol{u}, as shown in Fig. 33.2(b). The second term tells us to evaluate the difference Delta_(1)\boldsymbol{\Delta}_{1} in u\boldsymbol{u} evaluated at the tip and base of the vector v\boldsymbol{v}, as shown in Fig. 33.2(a). The commutator (Delta_(2)-Delta_(1))\left(\boldsymbol{\Delta}_{2}-\boldsymbol{\Delta}_{1}\right) therefore tells us to move along the vector v\boldsymbol{v} and then along the vector u\boldsymbol{u} and compare this to the final position if we move along u\boldsymbol{u} and then v\boldsymbol{v}. In pictures, this is a measurement of the vector that closes the figures in Fig. 33.2(c). If this were Euclidean space with constant vector fields, then the parallelogram that is made is a closed figure and the commutator is zero. However, in general, we cannot guarantee that vector fields make closed figures when transported along each other's lengths. The amount by which the figure fails to close is measured by the vector that is outputted by the commutator, as shown in Fig. 33.2(c). So, in short, the commutator measures the amount by which our attempt to make a parallelogram from vectors u\boldsymbol{u} and v\boldsymbol{v} fails to close.
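The component formula w^{\beta}=u^{\alpha}\partial_{\alpha}v^{\beta}-v^{\alpha}\partial_{\alpha}u^{\beta} can be evaluated symbolically. A sketch with two illustrative fields in the plane (the choice u = x ∂/∂y, v = y ∂/∂x is not from the text):

```python
import sympy as sp

x, y = sp.symbols('x y')
coords = [x, y]

# Two vector fields given by their components: u = x d/dy, v = y d/dx
u = [0, x]
v = [y, 0]

# w^beta = u^alpha d_alpha v^beta - v^alpha d_alpha u^beta
w = [sum(u[a] * sp.diff(v[b], coords[a]) - v[a] * sp.diff(u[b], coords[a])
         for a in range(2)) for b in range(2)]

# [x d/dy, y d/dx] = x d/dx - y d/dy: a non-vanishing vector field,
# so these two fields fail to close a parallelogram
assert all(sp.simplify(w[i] - e) == 0 for i, e in enumerate([x, -y]))
```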
In Chapter 10, we discussed the case that the vectors in the Lie brackets are the basis vectors e_(mu)e_{\mu} of a space. Since basis vectors should, in coordinates, be expressible simply as partial derivatives del//delx^(beta)\partial / \partial x^{\beta}, then we expect the part in the brackets to vanish. If it does not, it implies that
Fig. 33.2 (a) The sequence \boldsymbol{v} \boldsymbol{u} involves following \boldsymbol{v} then \boldsymbol{u}. The vector \boldsymbol{u} at the tip of \boldsymbol{v} will generally be different to the one at the base of \boldsymbol{v}. (b) The sequence \boldsymbol{u} \boldsymbol{v} involves following \boldsymbol{u} and then \boldsymbol{v}. (c) The commutator [\boldsymbol{u}, \boldsymbol{v}]. ^{3} This is just what we discussed in Chapters 3 and 10. ^{4} Although this might all seem somewhat contrived, the process of being carried along by a vector field is an example of a diffeomorphism, which is the generalization of an active coordinate transformation to the manifold. Diffeomorphisms allow us to describe the smoothness of a spacetime and are important in the mathematical description of relativity. See Appendix C for more details. In terms of physics, the Lie derivative is so important to us because, in cosmology, we consider that we are all being carried along by the current of the cosmological fluid of matter that fills the Universe. Fluids themselves are examined in detail in Chapter 39.
Fig. 33.3 Moving along a congruence of curves by an amount Delta lambda\Delta \lambda and comparing vectors from the field v\boldsymbol{v}. The paring vectors from the field vv. The
congruence is formed from the streamcongruence is formed from the stream-
lines of a vector field u\boldsymbol{u}. We take Lie lines of a vector field u\boldsymbol{u}. We take Lie
derivatives with respect to this second derivat
field. ^(5){ }^{5} The pound sign ££ means 'L' for Lie here. It's use in denoting the currency stems from the origin of the Enrency stems from the origin of the En-
glish word pound, which comes from glish word pound, which comes from
the Latin libra pondo meaning a 'pound the Latin lib ^(6){ }^{6} In Chapter 31, we used given curves to provide a tangent vector fields. Now we turn the tables and take a vector field to generate a series of curves to which the field is tangent.
it is impossible to express the basis in terms of derivatives of coordinates. Such bases are non-coordinate bases. In contrast, the set of basis vectors {e_(mu)}\left\{\boldsymbol{e}_{\mu}\right\} is a coordinate basis if and only if [e_(alpha),e_(beta)]=0.^(3)\left[\boldsymbol{e}_{\alpha}, \boldsymbol{e}_{\beta}\right]=0 .^{3} In the formulation of general relativity, many expressions are simplest when referred to a coordinate basis. However, we make and interpret measurements locally in orthonormal frames, which are generally noncoordinate frames.
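As a concrete check of this criterion (our own example, using the standard component formula for the commutator), the orthonormal polar frame hat(e)_(r)=del//del r\hat{\boldsymbol{e}}_{r}=\partial / \partial r, hat(e)_(theta)=(1//r)del//del theta\hat{\boldsymbol{e}}_{\theta}=(1 / r) \partial / \partial \theta fails the test and is therefore a non-coordinate basis, while the bare coordinate basis vectors commute:

```python
# A sketch (example ours): the orthonormal polar frame e_rhat = d/dr,
# e_thhat = (1/r) d/dtheta has a non-vanishing commutator, so it is a
# NON-coordinate basis; the plain {d/dr, d/dtheta} pair commutes.
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
coords = [r, th]

def commutator(u, v):
    return [sp.simplify(
        sum(u[a] * sp.diff(v[b], coords[a]) - v[a] * sp.diff(u[b], coords[a])
            for a in range(2)))
        for b in range(2)]

e_rhat = [sp.Integer(1), sp.Integer(0)]   # components in (r, theta)
e_thhat = [sp.Integer(0), 1 / r]          # unit-length theta direction

print(commutator(e_rhat, e_thhat))  # non-zero -> non-coordinate basis
print(commutator([1, 0], [0, 1]))   # coordinate basis vectors commute
```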
33.3 Lie derivatives of vectors
Previously, we saw that the exterior derivative measures the rate of change of a 1-form in the direction of the normal to the 1-form's surface. Even when generalized, this derivative applies only to pp-forms. We now turn to an alternative, more primitive, method of taking derivatives. Recall that a problem we face is that it's difficult to know how to compare tensor fields at two different points in space. Lie differentiation presents a solution to this problem by using a second vector field to provide a means of moving the original vectors around.
Imagine being carried along by the current in a river. ^(4){ }^{4} In order to evaluate the change of some vector field v\boldsymbol{v}, we can extract a vector at some position, allow ourselves to be carried along by the river current, and then compare the vector we originally extracted to another one at the new position at which we find ourselves. This is the essence of the Lie derivative. The current of the river can itself be represented by a second vector field u\boldsymbol{u} representing the velocity field of the river fluid. The vectors from the field u\boldsymbol{u} are tangent to a congruence of curves that we call the streamlines of the vector field u\boldsymbol{u}. These are shown in Fig. 33.3. We need to know both v\boldsymbol{v} and u\boldsymbol{u} to take the Lie derivative, which will turn out simply to be given by the commutator of the two fields [u,v][\boldsymbol{u}, \boldsymbol{v}].
Lie differentiation doesn't just apply to vectors like v\boldsymbol{v}; it can be applied to any tensor Z\boldsymbol{Z}. We write the Lie derivative of Z\boldsymbol{Z} as ^(5){ }^{5}
{:(33.26)£_(u)Z=((" Lie derivative of tensor "Z)/(" carried along vector field "u)):}\begin{equation*}
£_{\boldsymbol{u}} \boldsymbol{Z}=\binom{\text { Lie derivative of tensor } \boldsymbol{Z}}{\text { carried along vector field } \boldsymbol{u}} \tag{33.26}
\end{equation*}
We can think of the Lie derivative as being a derivative of Z\boldsymbol{Z} with respect to u\boldsymbol{u}, where the field u\boldsymbol{u} is used to provide a yardstick against which to measure changes in Z\boldsymbol{Z}. Importantly, if there is no difference in a tensor after being carried along by the current, we have £_(u)Z=0£_{\boldsymbol{u}} \boldsymbol{Z}=0 and we say that the tensor field has been Lie dragged. Geometrically, a Lie-dragged vector that joins two streamlines at some point will continue to join them after being carried along by the current, as shown in Fig. 33.4.
Next, we show that the Lie derivative of a vector field v\boldsymbol{v} obeys £_(u)v=£_{\boldsymbol{u}} \boldsymbol{v}=[u,v][\boldsymbol{u}, \boldsymbol{v}]. First, we need a suitable set of streamlines along which to carry the field. In a fluid, a streamline is a curve whose tangent at some point P\mathcal{P} gives the velocity u\boldsymbol{u} of the element of fluid at P\mathcal{P}. So we can think of the vector field u(x)\boldsymbol{u}(x) as supplying the streamlines. ^(6){ }^{6}
Example 33.4
More formally, we have a congruence when there is a unique curve c(lambda)c(\lambda) through each point P\mathcal{P} of space such that: (i) c(lambda=0)=Pc(\lambda=0)=\mathcal{P}, and (ii) the tangent vector at a point c(lambda)c(\lambda) is u(c(lambda))\boldsymbol{u}(c(\lambda)). This curve therefore has a tangent vector that is always given by the vector field evaluated at c(lambda)c(\lambda). The curve is called an integral curve of the vector field u\boldsymbol{u}. It is by no means obvious that such a curve will exist for some given field u\boldsymbol{u}, so we pause to consider it. The components of the vector field u\boldsymbol{u} at P\mathcal{P} are u^(mu)(P)u^{\mu}(\mathcal{P}) and change as a function of position. If we work in a coordinate system, the curve has coordinates x^(mu)(lambda)x^{\mu}(\lambda). Saying that the components u^(mu)(x^(1)(lambda),x^(2)(lambda),dotsx^(n)(lambda))u^{\mu}\left(x^{1}(\lambda), x^{2}(\lambda), \ldots x^{n}(\lambda)\right) are tangent to the curve with parameter lambda\lambda amounts to the set of differential equations
\begin{equation*}
\frac{d x^{\mu}}{d \lambda}=u^{\mu}\left(x^{1}(\lambda), x^{2}(\lambda), \ldots x^{n}(\lambda)\right)
\end{equation*}
This set of first-order differential equations is guaranteed ^(7){ }^{7} to have unique solutions in the region near P\mathcal{P}. This result guarantees the existence of the required congruence of integral curves.
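A short numerical sketch of this existence result (the field and initial data are our own): integrating dx^(mu)//d lambda=u^(mu)(x)\mathrm{d} x^{\mu} / \mathrm{d} \lambda=u^{\mu}(x) with an off-the-shelf ODE solver produces the integral curve through a chosen starting point.

```python
# Numerical sketch (our own example): the integral curves of the field
# u = (-y, x) solve dx/dlam = u(x); they are circles about the origin.
import numpy as np
from scipy.integrate import solve_ivp

def u(lam, x):
    """Vector field u^mu evaluated at the point x = (x1, x2)."""
    return [-x[1], x[0]]

sol = solve_ivp(u, (0.0, 2 * np.pi), [1.0, 0.0], rtol=1e-10, atol=1e-12,
                dense_output=True)

# After parameter distance 2*pi the streamline returns to its start,
# and the radius (here 1) is preserved all along the integral curve.
print(sol.y[:, -1])                        # close to [1, 0]
print(np.hypot(sol.y[0], sol.y[1]).max())  # close to 1
```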
With all of the ingredients in place, we're now ready to compute the derivative. We take the vector field at P\mathcal{P}, called v(P)\boldsymbol{v}(\mathcal{P}), and carry this vector a distance Delta lambda\Delta \lambda along an integral curve of the vector field u\boldsymbol{u} to a point Q\mathcal{Q}. This turns v(P)\boldsymbol{v}(\mathcal{P}) into the vector v^(')(Q)\boldsymbol{v}^{\prime}(\mathcal{Q}). We compare this transported vector to the vector v(Q)\boldsymbol{v}(\mathcal{Q}) found at Q\mathcal{Q}. The definition of the Lie derivative is then
\begin{equation*}
£_{\boldsymbol{u}} \boldsymbol{v}=\lim _{\Delta \lambda \rightarrow 0} \frac{\boldsymbol{v}(\mathcal{Q})-\boldsymbol{v}^{\prime}(\mathcal{Q})}{\Delta \lambda}
\end{equation*}
In the next example, we carry out this computation in a coordinate system.
Example 33.5
In terms of a coordinate system, we have a curve parametrized by lambda\lambda with a tangent vector with components u^(mu)=(dx^(mu))/(d lambda)u^{\mu}=\frac{d x^{\mu}}{d \lambda}. The point P\mathcal{P} has coordinates x^(mu)x^{\mu}. The coordinates of the nearby point Q\mathcal{Q} are x^(mu^('))x^{\mu^{\prime}} and these can be found using a coordinate transformation that translates us from P\mathcal{P} to Q\mathcal{Q}, along the vector u\boldsymbol{u}
\begin{equation*}
x^{\mu^{\prime}}=x^{\mu}+\Delta \lambda u^{\mu}(x) \tag{33.29}
\end{equation*}
To find the Lie derivative, we need to compare the field at a point Q\mathcal{Q} (with components v^(beta)v^{\beta} ) to the field carried from P\mathcal{P} to Q\mathcal{Q} (with components v^(beta^('))v^{\beta^{\prime}} )
Step I: We first work out the components of the transported vector. We do this by transforming the vector into the coordinate frame at Q\mathcal{Q} using the tensor transformation law ^(8){ }^{8} and then substituting eqn 33.29, thus
\begin{equation*}
v^{\beta^{\prime}}=\frac{\partial x^{\beta^{\prime}}}{\partial x^{\alpha}} v^{\alpha}(\mathcal{P})=v^{\beta}(\mathcal{P})+\Delta \lambda \frac{\partial u^{\beta}}{\partial x^{\alpha}} v^{\alpha}(\mathcal{P})
\end{equation*}
Fig. 33.4 A Lie-dragged vector continues to link two streamlines after being carried along the congruence.
^(7){ }^{7} This is a result which is proved in many of the standard references. An example is Choquet-Bruhat, DeWitt-Morette and Dillard-Bleick, Analysis, Manifolds and Physics.
Fig. 33.5 (a) The effect of transporting the vector v\boldsymbol{v} from P\mathcal{P} to Q\mathcal{Q} is measured in terms of the change in u\boldsymbol{u}. (b) The vector v\boldsymbol{v} evaluated at P\mathcal{P} and Q\mathcal{Q}. (c) The difference between (a) and (b) gives the Lie derivative: a vector that closes the circuit.
^(8){ }^{8} Remember that components must transform as A^(mu^('))=(delx^(mu^(')))/(delx^(nu))A^(nu)A^{\mu^{\prime}}=\frac{\partial x^{\mu^{\prime}}}{\partial x^{\nu}} A^{\nu}.
^(9){ }^{9} Note that if we have a connection Gamma^(beta)_(alpha gamma)\Gamma^{\beta}{ }_{\alpha \gamma}, this expression can be rewritten in terms of covariant derivatives in the equivalent form £_(u)v^(beta)=u^(alpha)v^(beta)_(;alpha)-v^(alpha)u^(beta)_(;alpha)£_{\boldsymbol{u}} v^{\beta}=u^{\alpha} v^{\beta}{ }_{; \alpha}-v^{\alpha} u^{\beta}{ }_{; \alpha}.
This is strictly optional: the connection is not needed to define the Lie derivative.
We see that the effect of carrying the vector v\boldsymbol{v} is measured by (delu^(beta))/(delx^(alpha))*v^(alpha)(P)\frac{\partial u^{\beta}}{\partial x^{\alpha}} \cdot v^{\alpha}(\mathcal{P}), which is the difference in the velocity field u\boldsymbol{u} measured at the base and tip of the vector v\boldsymbol{v}, as shown in Fig. 33.5(a).
Step II: Second, we compute the vector v\boldsymbol{v} at Q\mathcal{Q} with components v^(beta)(Q)v^{\beta}(\mathcal{Q}). This is done by seeing how the field v\boldsymbol{v} itself changes between P\mathcal{P} and Q\mathcal{Q} as we move along u\boldsymbol{u}
\begin{equation*}
v^{\beta}(\mathcal{Q})=v^{\beta}(\mathcal{P})+\Delta \lambda u^{\alpha} \frac{\partial v^{\beta}}{\partial x^{\alpha}}
\end{equation*}
This is shown geometrically in Fig. 33.5(b).
Step III: The previous two expressions involve the field evaluated at the same point, so can be combined. The Lie derivative is, therefore, ^(9){ }^{9}
\begin{equation*}
£_{\boldsymbol{u}} v^{\beta}=\lim _{\Delta \lambda \rightarrow 0} \frac{v^{\beta}(\mathcal{Q})-v^{\beta^{\prime}}(\mathcal{Q})}{\Delta \lambda}=u^{\alpha} \frac{\partial v^{\beta}}{\partial x^{\alpha}}-v^{\alpha} \frac{\partial u^{\beta}}{\partial x^{\alpha}}
\end{equation*}
We use this to formalize our set of rules for finding the Lie derivative of a field.
The Lie derivative of a scalar function with respect to the field u\boldsymbol{u} is defined to be the directional derivative along u\boldsymbol{u} :
{:(33.36)£_(u)f=u[f]=del_(u)f.:}\begin{equation*}
£_{\boldsymbol{u}} f=\boldsymbol{u}[f]=\partial_{\boldsymbol{u}} f . \tag{33.36}
\end{equation*}
The Lie derivative of a vector field v=d//dmu\boldsymbol{v}=\mathrm{d} / \mathrm{d} \mu with respect to u=d//dlambda\boldsymbol{u}=\mathrm{d} / \mathrm{d} \lambda is the commutator
\begin{equation*}
£_{\boldsymbol{u}} \boldsymbol{v}=[\boldsymbol{u}, \boldsymbol{v}]
\end{equation*}
Notice from the geometrical interpretation of the Lie derivative that £_(u)v£_{\boldsymbol{u}} \boldsymbol{v} evaluated at point P\mathcal{P} does not just depend on the value of v\boldsymbol{v} at P\mathcal{P}. It also depends on the value of v\boldsymbol{v} at surrounding points. This is unlike the ordinary partial derivative, which only depends on the point P\mathcal{P} in question. As a result, the Lie derivative is not powerful enough to become the derivative we use to describe curvature in manifolds. This is the reason why, ultimately, we must introduce a connection and a covariant derivative.
Example 33.6
Choose a coordinate system where u\boldsymbol{u} is a coordinate basis vector e_(1)=del//delx^(1)\boldsymbol{e}_{1}=\partial / \partial x^{1}. Its components are then constant, so the second term in the commutator vanishes and we have
\begin{equation*}
£_{\boldsymbol{u}} v^{\beta}=\frac{\partial v^{\beta}}{\partial x^{1}}
\end{equation*}
This suggests a sense in which we can interpret the Lie derivative as a coordinate-free version of the partial derivative.
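Example 33.6 can be confirmed in components (the test field v\boldsymbol{v} is our own choice): when u\boldsymbol{u} is the coordinate basis vector e_(1)\boldsymbol{e}_{1}, the commutator formula collapses to the plain partial derivative with respect to x^(1)x^{1}.

```python
# Sketch of Example 33.6 in components (test field ours): when
# u = e_1 = d/dx^1, the Lie derivative (L_u v)^b = u^a d_a v^b - v^a d_a u^b
# collapses to the plain partial derivative dv^b/dx^1.
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
coords = [x1, x2]

def lie_vector(u, v):
    return [sp.simplify(
        sum(u[a] * sp.diff(v[b], coords[a]) - v[a] * sp.diff(u[b], coords[a])
            for a in range(2)))
        for b in range(2)]

u = [sp.Integer(1), sp.Integer(0)]   # the coordinate basis vector e_1
v = [x1**2 * x2, sp.sin(x1) + x2]    # an arbitrary test field

assert lie_vector(u, v) == [sp.diff(v[0], x1), sp.diff(v[1], x1)]
print(lie_vector(u, v))  # [2*x1*x2, cos(x1)]
```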
33.4 Lie derivatives of tensors
We advertised the Lie derivative as working for all tensors. Another object we need to know how to differentiate in this way is therefore a 1-form. To find this, consider the contraction (: tilde(sigma),w:)\langle\tilde{\boldsymbol{\sigma}}, \boldsymbol{w}\rangle and take the Lie derivative with respect to the vector field v\boldsymbol{v}. To do this, we simply use the Leibniz rule ^(10)£_(v)(: tilde(sigma),w:)=(:£_(v)( tilde(sigma)),w:)+(:( tilde(sigma)),£_(v)w:){ }^{10} £_{\boldsymbol{v}}\langle\tilde{\boldsymbol{\sigma}}, \boldsymbol{w}\rangle=\left\langle £_{\boldsymbol{v}} \tilde{\boldsymbol{\sigma}}, \boldsymbol{w}\right\rangle+\left\langle\tilde{\boldsymbol{\sigma}}, £_{\boldsymbol{v}} \boldsymbol{w}\right\rangle. Since, for fields, (: tilde(sigma),w:)=f\langle\tilde{\boldsymbol{\sigma}}, \boldsymbol{w}\rangle=f, where ff is some scalar field, we have £_(v)(: tilde(sigma),w:)=del_(v)f£_{\boldsymbol{v}}\langle\tilde{\boldsymbol{\sigma}}, \boldsymbol{w}\rangle=\partial_{\boldsymbol{v}} f and, as a result,
\begin{equation*}
\left\langle £_{\boldsymbol{v}} \tilde{\boldsymbol{\sigma}}, \boldsymbol{w}\right\rangle=\partial_{\boldsymbol{v}}\langle\tilde{\boldsymbol{\sigma}}, \boldsymbol{w}\rangle-\left\langle\tilde{\boldsymbol{\sigma}}, £_{\boldsymbol{v}} \boldsymbol{w}\right\rangle
\end{equation*}
↷\curvearrowright Section 33.4 gives the technical recipe for applying the Lie derivative to tensors. It can be skipped if you're happy to take the useful expression in eqn 33.46 on trust.
^(10){ }^{10} Using the linearity of the derivative operation, we have for arbitrary tensors that £_(v)(S ox T)=£_(v)S ox T+S ox£_(v)T£_{\boldsymbol{v}}(\boldsymbol{S} \otimes \boldsymbol{T})=£_{\boldsymbol{v}} \boldsymbol{S} \otimes \boldsymbol{T}+\boldsymbol{S} \otimes £_{\boldsymbol{v}} \boldsymbol{T}. (33.39) If we contract this, we obtain a Leibniz rule £_(v)(:S,T:)=(:£_(v)S,T:)+(:S,£_(v)T:)£_{\boldsymbol{v}}\langle\boldsymbol{S}, \boldsymbol{T}\rangle=\left\langle £_{\boldsymbol{v}} \boldsymbol{S}, \boldsymbol{T}\right\rangle+\left\langle\boldsymbol{S}, £_{\boldsymbol{v}} \boldsymbol{T}\right\rangle. (33.40)
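The footnote's Leibniz rule can be checked symbolically (fields chosen by us; the component expression for £_(v)( tilde(sigma))£_{\boldsymbol{v}} \tilde{\boldsymbol{\sigma}} used below is the standard one and is assumed here rather than quoted from the text): with (£_(v)sigma)_(mu)=v^(alpha)del_(alpha)sigma_(mu)+sigma_(alpha)del_(mu)v^(alpha)\left(£_{\boldsymbol{v}} \sigma\right)_{\mu}=v^{\alpha} \partial_{\alpha} \sigma_{\mu}+\sigma_{\alpha} \partial_{\mu} v^{\alpha}, the rule £_(v)(: tilde(sigma),w:)=(:£_(v)( tilde(sigma)),w:)+(:( tilde(sigma)),£_(v)w:)£_{\boldsymbol{v}}\langle\tilde{\boldsymbol{\sigma}}, \boldsymbol{w}\rangle=\left\langle £_{\boldsymbol{v}} \tilde{\boldsymbol{\sigma}}, \boldsymbol{w}\right\rangle+\left\langle\tilde{\boldsymbol{\sigma}}, £_{\boldsymbol{v}} \boldsymbol{w}\right\rangle holds identically.

```python
# Sketch (fields chosen by us): the standard 1-form component rule,
# (L_v sigma)_m = v^a d_a sigma_m + sigma_a d_m v^a, satisfies the
# Leibniz rule L_v<sigma, w> = <L_v sigma, w> + <sigma, L_v w>.
import sympy as sp

x, y = sp.symbols('x y')
coords = [x, y]

def lie_scalar(v, f):
    return sum(v[a] * sp.diff(f, coords[a]) for a in range(2))

def lie_vector(v, w):
    return [sum(v[a] * sp.diff(w[b], coords[a]) - w[a] * sp.diff(v[b], coords[a])
                for a in range(2)) for b in range(2)]

def lie_oneform(v, s):
    return [sum(v[a] * sp.diff(s[m], coords[a]) + s[a] * sp.diff(v[a], coords[m])
                for a in range(2)) for m in range(2)]

v = [x * y, y]           # arbitrary test fields
sigma = [x + y, x**2]
w = [sp.sin(y), x]

lhs = lie_scalar(v, sum(sigma[m] * w[m] for m in range(2)))
rhs = (sum(lie_oneform(v, sigma)[m] * w[m] for m in range(2))
       + sum(sigma[m] * lie_vector(v, w)[m] for m in range(2)))
print(sp.simplify(lhs - rhs))  # 0: the Leibniz rule holds
```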
example of the use of Killing vectors. Noether's theorem is discussed in Chapter 40.
^(11){ }^{11} Wilhelm Killing (1847-1923). Killing invented Lie algebras independently of Sophus Lie, who was dismissive of Killing's work, claiming that he (Lie) had proven all that was valid in the subject, while all that was invalid was added by Killing. Killing was very modest about his own work, which was arguably intended to be more general than Lie's. In it, Killing made a number of unproven conjectures, which only later turned out to be true.
^(12){ }^{12} Note that Killing's equation can also be written in the memorable shorthand form xi_((alpha;beta))=0\xi_{(\alpha ; \beta)}=0.
After all of this setting up, we should mention some uses of the Lie derivative. The Lie derivative is the natural derivative for expressing the invariance of a tensor under a change in position along a curve. The Lie derivative arises most often in relativity in cases where we imagine ourselves carried along a particular world line while making measurements. We often use comoving coordinates in which we float along with an element of fluid, as in cosmology, for example. Another use arises when examining geodesic deviation. The vector linking the two geodesics has zero Lie derivative along the velocity field of the geodesics.
33.5 Killing vectors
A particularly important class of vectors are those that represent conserved quantities. In mechanics and field theory, Noether's theorem tells us that wherever there is a symmetry, there is a conserved quantity. There is also a geometric method of extracting the conserved quantities from the metric that relies on the use of the Lie derivative and which is particularly useful in relativity.
Geometrically, conserved quantities can be found by identifying Killing vectors. ^(11){ }^{11} We say that a geometry has an isometry if we can identify a vector field xi\boldsymbol{\xi} with the property that if a set of points is displaced along the streamlines of xi\boldsymbol{\xi}, then all distance relationships are unchanged. This vector field xi\boldsymbol{\xi} is then a Killing vector field for the geometry. Distance relationships are encoded by the metric tensor, and so we want the metric tensor to remain unchanged as we carry it along the streamlines of xi\boldsymbol{\xi}. This implies we have
\begin{equation*}
£_{\boldsymbol{\xi}} \boldsymbol{g}=0
\end{equation*}
and so, if the metric tensor is Lie dragged along the congruence formed by the field xi\boldsymbol{\xi}, then xi\boldsymbol{\xi} is a Killing field.
When we have access to a connection, the condition £_(xi)g=0£_{\boldsymbol{\xi}} \boldsymbol{g}=0 implies that xi\boldsymbol{\xi} satisfies Killing's equation ^(12){ }^{12}
\begin{equation*}
\xi_{\alpha ; \beta}+\xi_{\beta ; \alpha}=0
\end{equation*}
Often Killing vectors can be written down by inspection of the form of the metric. From Example 33.6, we see that if xi\boldsymbol{\xi} were to be, say, the basis vector e_(1)\boldsymbol{e}_{1}, then we have
\begin{equation*}
\left(£_{\boldsymbol{\xi}} \boldsymbol{g}\right)_{\mu \nu}=\frac{\partial g_{\mu \nu}}{\partial x^{1}}=0
\end{equation*}
and so the metric components can't depend on coordinate x^(1)x^{1}. Therefore, a metric that is independent of x^(1)x^{1} has a Killing vector e_(1)\boldsymbol{e}_{1}. This is the most straightforward way to find Killing vectors: simply see which coordinates do not feature in any of the components of the metric and identify the corresponding basis vectors. Finally, we need to know how to find conserved quantities. Here's the rule:
The dot product u*xi\boldsymbol{u} \cdot \boldsymbol{\xi} of a Killing field xi\boldsymbol{\xi} and tangent u\boldsymbol{u} to a geodesic is a constant of the motion along that geodesic.
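The inspection method can be checked directly from the condition £_(xi)g=0£_{\boldsymbol{\xi}} \boldsymbol{g}=0. The sketch below (examples ours) uses the standard component formula for the Lie derivative of a (0,2)(0,2) tensor, which we assume rather than quote from the text:

```python
# Sketch (examples ours): a field xi is Killing iff (L_xi g)_{mn} =
# xi^a d_a g_{mn} + g_{an} d_m xi^a + g_{ma} d_n xi^a vanishes.
import sympy as sp

def lie_metric(xi, g, coords):
    n = len(coords)
    return sp.Matrix(n, n, lambda m, k: sp.simplify(
        sum(xi[a] * sp.diff(g[m, k], coords[a])
            + g[a, k] * sp.diff(xi[a], coords[m])
            + g[m, a] * sp.diff(xi[a], coords[k]) for a in range(n))))

x, y = sp.symbols('x y')
g_flat = sp.eye(2)

print(lie_metric([-y, x], g_flat, [x, y]))  # zero matrix: rotations are isometries
print(lie_metric([x, 0], g_flat, [x, y]))   # non-zero: a dilation is not Killing

r, th = sp.symbols('r theta', positive=True)
g_polar = sp.diag(1, r**2)
print(lie_metric([0, 1], g_polar, [r, th])) # zero: theta is an ignorable coordinate
```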
Example 33.10
This latter claim is easily proved by using the covariant derivative grad_(u)\nabla_{u} to find the change along the geodesic of the dot product, grad_(u)(u*xi)\nabla_{\boldsymbol{u}}(\boldsymbol{u} \cdot \boldsymbol{\xi}). Using the Leibniz rule, we have
\begin{equation*}
\boldsymbol{\nabla}_{\boldsymbol{u}}(\boldsymbol{u} \cdot \boldsymbol{\xi})=\left(\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{u}\right) \cdot \boldsymbol{\xi}+\boldsymbol{u} \cdot\left(\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{\xi}\right)
\end{equation*}
By definition, a geodesic parallel transports its own tangent vector, meaning grad_(u)u=0\nabla_{u} u=0, and so the first term is zero. The second term is
\begin{equation*}
\boldsymbol{u} \cdot\left(\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{\xi}\right)=u^{\beta} u^{\alpha} \xi_{\beta ; \alpha}
\end{equation*}
Since we're contracting both indices against the components of u\boldsymbol{u}, this last term is symmetric in alpha\alpha and beta\beta, allowing us to write
\begin{equation*}
u^{\beta} u^{\alpha} \xi_{\beta ; \alpha}=u^{\alpha} u^{\beta} \xi_{(\alpha ; \beta)}=0
\end{equation*}
where the final equality follows from Killing's equation xi_((alpha;beta))=0\xi_{(\alpha ; \beta)}=0. We conclude that the quantity u*xi\boldsymbol{u} \cdot \boldsymbol{\xi} is unchanged along the geodesic.
Let's now use this toolkit to extract some Killing vectors.
Example 33.11
The Schwarzschild metric has a line element
As we saw in Chapter 22, this leads to conserved quantities along geodesics of
where u\boldsymbol{u} is tangent to the geodesics.
Example 33.12
Consider Rindler spacetime with metric line element ds^(2)=-x^(2)dt^(2)+dx^(2)\mathrm{d} s^{2}=-x^{2} \mathrm{~d} t^{2}+\mathrm{d} x^{2}. Labelling coordinates (t,x)(t, x), we see that the coordinate tt does not feature in any of the components of the metric. The vector xi=e_(t)\boldsymbol{\xi}=\boldsymbol{e}_{t} is therefore a Killing vector. This implies that we have a conserved quantity along the geodesic of u*xi=u_(t)\boldsymbol{u} \cdot \boldsymbol{\xi}=u_{t}, or
\begin{equation*}
u_{t}=g_{t t} u^{t}=-x^{2} \frac{\mathrm{d} t}{\mathrm{d} \tau}=\text { constant }
\end{equation*}
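This conservation law can be verified numerically (the initial data and the Christoffel symbols quoted in the comments are our own computation for this line element): integrate the Rindler geodesic equations and watch u*xi=-x^(2)dt//d tau\boldsymbol{u} \cdot \boldsymbol{\xi}=-x^{2} \mathrm{~d} t / \mathrm{d} \tau stay constant.

```python
# Numerical sketch (initial data ours): for ds^2 = -x^2 dt^2 + dx^2 the
# non-zero Christoffel symbols are Gamma^t_tx = 1/x and Gamma^x_tt = x,
# giving the geodesic equations
#   t'' = -(2/x) t' x',   x'' = -x (t')^2.
# The Killing vector xi = e_t gives the constant u.xi = -x^2 t'.
import numpy as np
from scipy.integrate import solve_ivp

def geodesic(tau, s):
    t, x, tdot, xdot = s
    return [tdot, xdot, -2.0 * tdot * xdot / x, -x * tdot**2]

# start at x = 1 with dt/dtau = 1, dx/dtau = 0.5 (an arbitrary choice)
sol = solve_ivp(geodesic, (0.0, 1.0), [0.0, 1.0, 1.0, 0.5],
                rtol=1e-10, atol=1e-12, dense_output=True)

taus = np.linspace(0.0, 1.0, 50)
t, x, tdot, xdot = sol.sol(taus)
energy = -x**2 * tdot              # u . xi along the geodesic
print(energy.min(), energy.max())  # both very close to -1.0
```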
The Lie derivative is a method of taking derivatives of a tensor field. It requires an additional vector field u\boldsymbol{u}.
The Lie derivative of a vector v\boldsymbol{v} with respect to u\boldsymbol{u} is £_(u)v=[u,v]£_{\boldsymbol{u}} \boldsymbol{v}=[\boldsymbol{u}, \boldsymbol{v}].
Killing vectors allow us access to conserved quantities in geometrical theories.
Along a geodesic with a tangent vector field u\boldsymbol{u}, the quantity u*xi\boldsymbol{u} \cdot \boldsymbol{\xi} is conserved, where xi\boldsymbol{\xi} is a Killing vector.
Exercises
(33.1) Consider the amount by which the figure fails to close in Example 33.3. We shall write the distance between initial and final points as
where u(P_(i))\boldsymbol{u}\left(\mathcal{P}_{i}\right) is the vector field u\boldsymbol{u} evaluated at point P_(i)\mathcal{P}_{i}. Show that this expression yields the commutator of the vector fields u\boldsymbol{u} and v\boldsymbol{v}, evaluated at P_(0)\mathcal{P}_{0}.
(33.2) (a) Determine the components of the Lie derivative of the (1,1)(1,1) tensor A\boldsymbol{A}, written (£_(u)A)_(nu)^(mu)\left(£_{u} \boldsymbol{A}\right)_{\nu}^{\mu}.
(b) Contraction can be thought of as multiplication by the Kronecker delta delta^(mu)_(nu)=(:omega^(mu),e_(nu):)\delta^{\mu}{ }_{\nu}=\left\langle\boldsymbol{\omega}^{\mu}, \boldsymbol{e}_{\nu}\right\rangle. By
taking a Lie derivative of delta^(mu)_(nu)\delta^{\mu}{ }_{\nu}, show that Lie differentiation commutes with contraction.
(33.3) Show that
by acting with this combination on (i) a scalar function and (ii) a vector field.
Hint: For (ii), use the Jacobi identity [x,[y,z]]+[\boldsymbol{x},[\boldsymbol{y}, \boldsymbol{z}]]+[z,[x,y]]+[y,[z,x]]=0[\boldsymbol{z},[\boldsymbol{x}, \boldsymbol{y}]]+[\boldsymbol{y},[\boldsymbol{z}, \boldsymbol{x}]]=0.
(33.4) Using the results of the last exercise, show that the commutator of two Killing fields is a Killing field.
Geometry of the connection
Difficult you call it, Sir? I wish it were impossible. Samuel Johnson (1709-1784) on hearing a famous violinist
To understand the physics of relativity we need the notion of a derivative to encode rates of change. So far in this part of the book, we have encountered two derivatives: the exterior derivative and the Lie derivative. The exterior derivative is designed to work on forms only. The Lie derivative, which works on all tensors, depends on the behaviour of a field at two points, which is unlike the partial derivative in ordinary calculus. Neither of these derivatives is particularly satisfactory to describe the physics of gravitation. In order to have a satisfactory derivative, we need to add a little more structure to our primitive spaces (or manifolds). The solution is found by adding the notion of parallelism to our toolkit. ^(1){ }^{1} In order to compare two vectors at different points in space, we must have the ability to parallel transport one to the other's location. Being able to keep a vector parallel requires that we must be able to set up basis vectors and 1-forms at each point and, somehow, to connect them. This then allows us to form the covariant derivative that compares a vector at a point to its parallel transported counterpart from another point. It is this derivative that is the most useful for formulating general relativity.
We therefore define ^(2){ }^{2} a connection, with symbol grad\boldsymbol{\nabla}. This is not a tensor itself, but can be applied to any tensor field and, like the conventional derivative, only takes in the properties of space at a single point. Like the exterior derivative d\boldsymbol{d}, the connection can be thought of as a (0,1)(0,1) object. In fact, the exterior derivative d\boldsymbol{d} and the connection grad\boldsymbol{\nabla} are identical in their action on scalars. ^(3){ }^{3} As we shall see, the introduction of grad\boldsymbol{\nabla}, which could be viewed as an upgrade of the exterior derivative dd, allows access to an efficient means of extracting the all-important curvature of a spacetime.
The connection can be used to measure the change in a quantity on being transported along a vector u\boldsymbol{u}. The connection directed along u\boldsymbol{u} is u*grad\boldsymbol{u} \cdot \boldsymbol{\nabla}, which is the covariant derivative and given the symbol grad_(u)\boldsymbol{\nabla}_{\boldsymbol{u}}. The covariant derivative tells us to move along the integral curves of the field u\boldsymbol{u}, comparing tensors as we go, using parallel transport to account for any changes in coordinate system at different points in space. Rules for the workings of the connection are given in the margin.
34.1 Covariant derivative in pictures
34.2 Connection and exterior derivative
34.3 Covariant derivative of tensors
^(1){ }^{1} We shall see in more detail in this chapter how parallelism relies on having a metric. This fundamentally links the covariant derivative and the metric. Since general relativity is the physics of the metric field, this explains why the covariant derivative features so heavily in the description of the mathematics of relativity.
^(2){ }^{2} The abstract treatment of the covariant derivative grad_(u)\boldsymbol{\nabla}_{\boldsymbol{u}} was developed by Jean-Louis Koszul (1921-2018). The operator grad\boldsymbol{\nabla} wasn't employed in anger until around 1954, in a paper by Katsumi Nomizu (1924-2008), who calls it tt rather than grad\nabla.
^(3){ }^{3} When a connection is present, the action of d\boldsymbol{d} and grad\boldsymbol{\nabla} on vectors, or indeed on any (n,0)(n, 0) tensor, will be identical. However, d\boldsymbol{d} and grad\boldsymbol{\nabla} are not identical in their action on forms.
In order for the connection to only use the information at a single point, we require linearity in the direction along which we point it: grad_(au+bv)w=agrad_(u)w+bgrad_(v)w\boldsymbol{\nabla}_{a \boldsymbol{u}+b \boldsymbol{v}} \boldsymbol{w}=a \boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{w}+b \boldsymbol{\nabla}_{\boldsymbol{v}} \boldsymbol{w}. (34.1)
We also want it to be linear in the argument on which it operates:
\begin{equation*}
\boldsymbol{\nabla}_{\boldsymbol{u}}(a \boldsymbol{v}+b \boldsymbol{w})=a \boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{v}+b \boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{w} \tag{34.2}
\end{equation*}
Finally, we want the connection to obey a Leibniz product rule grad(S ox T)=(grad S ox T)+(S ox grad T)\boldsymbol{\nabla}(\boldsymbol{S} \otimes \boldsymbol{T})=(\boldsymbol{\nabla} \boldsymbol{S} \otimes \boldsymbol{T})+(\boldsymbol{S} \otimes \boldsymbol{\nabla} \boldsymbol{T}).
^(4){ }^{4} The (1,2)(1,2) torsion tensor tau\boldsymbol{\tau} along with the (1,3)(1,3) Riemann curvature tensor R\boldsymbol{R} are two independent characteristic features of a connection. The torsion-free (tau=0)(\boldsymbol{\tau}=0) spaces we consider in general relativity all share the property of symmetry of the connection coefficients: Gamma^(lambda)_(mu nu)=Gamma^(lambda)_(nu mu)\Gamma^{\lambda}{ }_{\mu \nu}=\Gamma^{\lambda}{ }_{\nu \mu}.
Fig. 34.1 After the vectors u\boldsymbol{u} and v\boldsymbol{v} have been transported, the quadrilateral fails to close. ^(5){ }^{5} The symbol grad_(u)v\nabla_{u} v tells us to transport the vector v\boldsymbol{v} along the vector field u\boldsymbol{u} and measure the difference in v\boldsymbol{v}. We could, of course, transport the vector field u\boldsymbol{u} along the vector v\boldsymbol{v} by considering grad_(v)u\boldsymbol{\nabla}_{\boldsymbol{v}} \boldsymbol{u}.
34.1 Covariant derivative in pictures
In the sorts of manifolds with which we work in general relativity, there is a neat way of visualizing the properties of the covariant derivative.
Although general relativity embraces a geometrical description of gravity, it does so by singling out curvature as the fundamental geometrical object associated with a connection, to the detriment of a quantity called torsion. The torsion of two vector fields u\boldsymbol{u} and v\boldsymbol{v} is a (1,2)(1,2) tensor tau\boldsymbol{\tau} defined by
\begin{equation*}
\boldsymbol{\tau}(\boldsymbol{u}, \boldsymbol{v})=\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{v}-\boldsymbol{\nabla}_{\boldsymbol{v}} \boldsymbol{u}-[\boldsymbol{u}, \boldsymbol{v}]
\end{equation*}
It's certainly worth noting that alternative theories of gravity can be formulated in terms of torsion, which potentially provide a richer phenomenology than the conventional curvature-based theory. However, in the spaces that we conventionally consider in general relativity, the torsion vanishes, ^(4){ }^{4} so we always have the property that
\begin{equation*}
\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{v}-\boldsymbol{\nabla}_{\boldsymbol{v}} \boldsymbol{u}=[\boldsymbol{u}, \boldsymbol{v}], \quad u^{\alpha} v^{\beta}{ }_{; \alpha}-v^{\alpha} u^{\beta}{ }_{; \alpha}=[\boldsymbol{u}, \boldsymbol{v}]^{\beta}
\end{equation*}
where the second expression is given in terms of components and semicolon notation.
Since the vanishing of torsion allows us to link the connection and the Lie derivative, we can also interpret the connection visually, in terms of whether a figure fails to close, in the same way that we interpreted the Lie derivative in the last chapter.
Example 34.2
As in the last chapter, we evaluate the vector that closes the figure v\boldsymbol{v}-u\boldsymbol{u}-v\boldsymbol{v}-u\boldsymbol{u}. As we've seen, this is given by the commutator [v,u]=vu-uv[\boldsymbol{v}, \boldsymbol{u}]=\boldsymbol{v} \boldsymbol{u}-\boldsymbol{u} \boldsymbol{v}. Consider Fig. 34.1, where we use the parameters lambda\lambda and sigma\sigma to parametrize the vector fields v\boldsymbol{v} and u\boldsymbol{u} respectively. The figure is formed from the vector field v\boldsymbol{v} evaluated at parameter points lambda_(0)\lambda_{0} and lambda_(0)+1\lambda_{0}+1, which gives us two different vectors. (They are different vectors, because we've evaluated the vector field at two different points.) We also evaluate the field u\boldsymbol{u} at points sigma_(0)\sigma_{0} and sigma_(0)+1\sigma_{0}+1 and use these four vectors to form the figure. We see that the figure does not close. Using the fact that ^(5){ }^{5}
we see that the amount by which it fails to close is given by grad_(u)v-grad_(v)u\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{v}-\boldsymbol{\nabla}_{\boldsymbol{v}} \boldsymbol{u}. It follows that grad_(u)v-grad_(v)u=[u,v]\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{v}-\boldsymbol{\nabla}_{\boldsymbol{v}} \boldsymbol{u}=[\boldsymbol{u}, \boldsymbol{v}].
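This torsion-free identity can be confirmed symbolically (test fields ours; the polar-coordinate Christoffel symbols used are the standard flat-space ones):

```python
# Symbolic sketch (fields ours): with the symmetric polar-coordinate
# connection, Gamma^r_{th th} = -r and Gamma^th_{r th} = Gamma^th_{th r} = 1/r,
# the torsion-free identity  nabla_u v - nabla_v u = [u, v]  holds.
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
coords = [r, th]

# Christoffel symbols Gamma[b][a][c] for flat space in polar coordinates
Gamma = [[[0, 0], [0, -r]],            # b = r
         [[0, 1 / r], [1 / r, 0]]]     # b = theta

def cov_deriv(u, v):
    """Components of nabla_u v: u^a (d_a v^b + Gamma^b_ac v^c)."""
    return [sp.simplify(sum(u[a] * (sp.diff(v[b], coords[a])
                                    + sum(Gamma[b][a][c] * v[c] for c in range(2)))
                            for a in range(2))) for b in range(2)]

def commutator(u, v):
    return [sp.simplify(sum(u[a] * sp.diff(v[b], coords[a])
                            - v[a] * sp.diff(u[b], coords[a])
                            for a in range(2))) for b in range(2)]

u = [r * th, sp.Integer(1)]   # arbitrary test fields
v = [sp.sin(th), r]

lhs = [sp.simplify(cov_deriv(u, v)[b] - cov_deriv(v, u)[b]) for b in range(2)]
rhs = commutator(u, v)
print([sp.simplify(lhs[b] - rhs[b]) for b in range(2)])  # [0, 0]
```

The cancellation works because the connection coefficients are symmetric in their lower indices, so the Gamma\Gamma terms drop out of the difference, leaving exactly the commutator.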
For cases where the commutator [u,v][\boldsymbol{u}, \boldsymbol{v}] vanishes (i.e. when the quadrilateral closes), we have the useful property of the symmetry of the covariant derivative which says
{:(34.8)grad_(u)v=grad_(v)u:}\begin{equation*}
\nabla_{u} v=\nabla_{v} u \tag{34.8}
\end{equation*}
34.2 Connection and exterior derivative
How do we understand the connection symbol grad\boldsymbol{\nabla} ? That is to say, what is the meaning of a covariant derivative without the specification of a direction along which to take the derivative? The connection grad\boldsymbol{\nabla} is much like the exterior derivative d\boldsymbol{d}, and is defined on a scalar function via
{:(34.9)grad f=df:}\begin{equation*}
\nabla f=\boldsymbol{d} f \tag{34.9}
\end{equation*}
↷" This "_(" some formal section introduces that will ")\underset{\text { some formal section introduces that will }}{\curvearrowright \text { This }} some formal ideas that will be taken up in Chapter 36 to formulate a highly effient method for extract
ture from a metric.
The resulting quantity is a 1 -form. We can give it a direction using (:df,u:)=del_(u)f\langle\boldsymbol{d} f, \boldsymbol{u}\rangle=\partial_{\boldsymbol{u}} f, or equivalently
{:(34.10)u*grad f=grad_(u)f-=u^(mu)(del f)/(delx^(mu))-=del_(u)f:}\begin{equation*}
\boldsymbol{u} \cdot \nabla f=\nabla_{u} f \equiv u^{\mu} \frac{\partial f}{\partial x^{\mu}} \equiv \partial_{\boldsymbol{u}} f \tag{34.10}
\end{equation*}
where we've written the results in several forms of notation. ^(6){ }^{6}
Previously, we confined the use of the differential operator d\boldsymbol{d} to forms [i.e. antisymmetric (0,n)(0, n) objects] and, in this spirit, we used it above on a function, which is a 0 -form. We now extend the use of this tool to vectors, which are (1,0)(1,0) objects. When acting on vectors, the action of d\boldsymbol{d} and grad\boldsymbol{\nabla} are identical. We interpret the action of d\boldsymbol{d} and grad\boldsymbol{\nabla} on a vector as saying that the object is vector-valued [that is, there's a 1 in the first slot of the valency (1,0)(1,0) ], but that it is also a 0 -form [that is, the 0 part of (1,0)(1,0) ]. Crucial here is having a connection available. In its absence, the action of d\boldsymbol{d} on a vector is undefined.
In component notation, the vector v\boldsymbol{v} is written as v^(mu)e_(mu)v^{\mu} \boldsymbol{e}_{\mu}. Acting on this with d\boldsymbol{d} gives rise to (i) a contribution from dv^(mu)\boldsymbol{d} v^{\mu}, which is simply the action of d\boldsymbol{d} on a scalar function, and (ii) a contribution from the action on the vector-valued 0-form e_(mu)\boldsymbol{e}_{\mu}. The covariant derivative of a vector v=v^(mu)e_(mu)\boldsymbol{v}=v^{\mu} \boldsymbol{e}_{\mu} can then be written as
\begin{equation*}
\boldsymbol{\nabla} \boldsymbol{v}=\boldsymbol{d} \boldsymbol{v}=\left(\boldsymbol{d} v^{\mu}\right) \boldsymbol{e}_{\mu}+v^{\mu}\left(\boldsymbol{d} \boldsymbol{e}_{\mu}\right)
\end{equation*}
The quantity dv\boldsymbol{d} \boldsymbol{v} is known as a vector-valued 1-form. ^(7){ }^{7} Note that it depends on de_(mu)\boldsymbol{d} \boldsymbol{e}_{\mu}, which tells us how the basis vectors change in space. This quantity gives the nature of the connection. We therefore define ^(8){ }^{8} connection coefficients in terms of the action of d\boldsymbol{d} (or grad\boldsymbol{\nabla}) on the basis vectors e_(mu)\boldsymbol{e}_{\mu} as follows:
\begin{equation*}
\boldsymbol{d} \boldsymbol{e}_{\mu}=\boldsymbol{\nabla} \boldsymbol{e}_{\mu}=\Gamma^{\alpha}{ }_{\nu \mu}\left(\boldsymbol{\omega}^{\nu} \otimes \boldsymbol{e}_{\alpha}\right)
\end{equation*}
We can relate this more geometric definition of the connection to our previous notion of connection coefficients (see eqn 7.6) in the following example. ^(6){ }^{6} This doesn't exhaust the notational possibilities. For example, an equivalent description in a more geometrical notation is
{:(34.11)grad_(u)f-=(:df","u:)-=u[f]:}\begin{equation*}
\boldsymbol{\nabla}_{\boldsymbol{u}} f \equiv\langle\boldsymbol{d} f, \boldsymbol{u}\rangle \equiv \boldsymbol{u}[f] \tag{34.11}
\end{equation*}
^(7){ }^{7} Why not simply call this a (1,1)(1,1) tensor? It's more subtle than that. A general (1,1)(1,1) tensor may be written as
\begin{equation*}
\boldsymbol{S}=S^{\mu}{ }_{\nu}\left(\boldsymbol{e}_{\mu} \otimes \boldsymbol{\omega}^{\nu}\right)
\end{equation*}
However, the vector-valued 1-form features the term $\boldsymbol{d} \boldsymbol{e}_{\mu}$, which contains information about how the basis vectors themselves change in space. This relies on the connection, and so the vector-valued 1-form requires more structure than is needed to define a standard tensor like $\boldsymbol{S}$. ${ }^{8}$ Previously (Chapter 9), we found the connection coefficients by considering derivatives of the metric, which has been notable by its absence in this part of the book. We shall see at the end of the chapter how parallelism, which gives us the covariant derivative (and therefore these coefficients), relies on having a metric.
Example 34.3
Noting that (:grad A,e_(beta):)=grad_(beta)A\left\langle\boldsymbol{\nabla} \boldsymbol{A}, \boldsymbol{e}_{\beta}\right\rangle=\boldsymbol{\nabla}_{\beta} \boldsymbol{A}, we have grad_(beta)e_(mu)=Gamma^(alpha)_(nu mu)(:(omega^(nu)oxe_(alpha)),e_(beta):)\boldsymbol{\nabla}_{\beta} \boldsymbol{e}_{\mu}=\Gamma^{\alpha}{ }_{\nu \mu}\left\langle\left(\boldsymbol{\omega}^{\nu} \otimes \boldsymbol{e}_{\alpha}\right), \boldsymbol{e}_{\beta}\right\rangle=Gamma^(alpha)_(nu mu)omega^(nu)(e_(beta))oxe_(alpha)=\Gamma^{\alpha}{ }_{\nu \mu} \boldsymbol{\omega}^{\nu}\left(\boldsymbol{e}_{\beta}\right) \otimes \boldsymbol{e}_{\alpha}=Gamma^(alpha)_(nu mu)delta^(nu)_(beta)e_(alpha)=Gamma^(alpha)_(beta mu)e_(alpha)quad=\Gamma^{\alpha}{ }_{\nu \mu} \delta^{\nu}{ }_{\beta} \boldsymbol{e}_{\alpha}=\Gamma^{\alpha}{ }_{\beta \mu} \boldsymbol{e}_{\alpha} \quad (using omega^(nu)(e_(beta))=delta^(nu)_(beta)\boldsymbol{\omega}^{\nu}\left(\boldsymbol{e}_{\beta}\right)=\delta^{\nu}{ }_{\beta} ),
or, picking out components by taking an inner product with omega^(sigma)\boldsymbol{\omega}^{\sigma}
Using the last example and noting that dx^(beta)-=omega^(beta)\boldsymbol{d} x^{\beta} \equiv \boldsymbol{\omega}^{\beta}, we have the final expression,
Note that this can also be written in semicolon notation as grad v=\nabla \boldsymbol{v}=v^(mu)_(;beta)(omega^(beta)oxe_(mu))v^{\mu}{ }_{; \beta}\left(\boldsymbol{\omega}^{\beta} \otimes \boldsymbol{e}_{\mu}\right). That is to say that the components of grad v\boldsymbol{\nabla} \boldsymbol{v} are the covariant derivative components v^(mu)_(;beta)v^{\mu}{ }_{; \beta}. We conclude that from the linear slot machine point of view, the connection symbol grad\boldsymbol{\nabla} acting on a vector v\boldsymbol{v} results in the 2-slotted object
This is the gradient of v\boldsymbol{v}. We summarize the properties of the gradient in the next example.
Example 34.5
The gradient of v\boldsymbol{v} is neither a vector nor a 1-form, but a vector-valued 1-form, sometimes written as dv\boldsymbol{d} \boldsymbol{v}. You must insert both a vector and a 1 -form to get a number. It has components
That is, the components of the connection grad\boldsymbol{\nabla} can be thought of as the connection coefficients.
The quantity grad v(u)=,grad_(u)v\boldsymbol{\nabla} \boldsymbol{v}(\boldsymbol{u})=,\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{v} is a vector. It is the covariant derivative of v\boldsymbol{v} along uu with components
The number output is the number of times the vector $\nabla_{\boldsymbol{u}} \boldsymbol{v}$ pierces the surfaces of the 1-form $\tilde{\boldsymbol{\sigma}}$.
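To make these component formulas concrete, here is a minimal symbolic check, assuming the `sympy` library is available and using plane polar coordinates as an illustrative chart (not an example from the text). It builds the Christoffel symbols of $\mathrm{d}s^{2}=\mathrm{d}r^{2}+r^{2}\mathrm{d}\theta^{2}$ and assembles the components $v^{\mu}{}_{;\beta}=\partial_{\beta}v^{\mu}+\Gamma^{\mu}{}_{\beta\nu}v^{\nu}$ of $\boldsymbol{\nabla}\boldsymbol{v}$:

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
coords = [r, th]
n = 2

# Flat plane in polar coordinates: ds^2 = dr^2 + r^2 dtheta^2
g = sp.diag(1, r**2)
ginv = g.inv()

# Christoffel symbols Gamma^mu_{alpha beta} derived from the metric
Gamma = [[[sp.simplify(sum(sp.Rational(1, 2)*ginv[mu, s] *
                           (sp.diff(g[s, a], coords[b]) + sp.diff(g[s, b], coords[a])
                            - sp.diff(g[a, b], coords[s])) for s in range(n)))
           for b in range(n)] for a in range(n)] for mu in range(n)]

# Components of the gradient: v^mu_{;beta} = d_beta v^mu + Gamma^mu_{beta nu} v^nu
v = [sp.Function('vr')(r, th), sp.Function('vth')(r, th)]
Dv = [[sp.diff(v[mu], coords[b]) + sum(Gamma[mu][b][nu]*v[nu] for nu in range(n))
       for b in range(n)] for mu in range(n)]
```

The only non-zero coefficients that emerge are $\Gamma^{r}{}_{\theta\theta}=-r$ and $\Gamma^{\theta}{}_{r\theta}=\Gamma^{\theta}{}_{\theta r}=1/r$, so, for instance, $v^{r}{}_{;\theta}=\partial_{\theta}v^{r}-r\,v^{\theta}$.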
Finally, we shall derive the transformation properties of the connection coefficients.
Example 34.6
Relating primed and unprimed frames, we write
Now using the Leibniz rule and the rule that ${ }^{9}$ $\boldsymbol{\nabla}_{\boldsymbol{e}_{\beta}} f=\boldsymbol{e}_{\beta}[f]$, we find
The second term is the usual tensor transformation law for the components of a tensor. However, it's the first term that ruins the tensor transformation properties. We can see how by working in a coordinate frame (recalling that $\Lambda^{\alpha}{ }_{\beta}=\partial x^{\alpha} / \partial x^{\beta}$ and $\boldsymbol{e}_{\beta}[f]=\partial f / \partial x^{\beta}$), where we find
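The inhomogeneous term can be seen at work in a short symbolic sketch, assuming `sympy` and using the Cartesian-to-polar change of chart as an illustrative choice (not an example from the text). In Cartesian coordinates the flat-plane coefficients all vanish, so the transformation law reduces to its inhomogeneous part, $\Gamma^{\prime\alpha}{}_{\beta\gamma}=\frac{\partial x^{\prime\alpha}}{\partial x^{\mu}}\frac{\partial^{2}x^{\mu}}{\partial x^{\prime\beta}\partial x^{\prime\gamma}}$, which nevertheless generates the familiar non-zero polar-coordinate coefficients:

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
prim = [r, th]                        # primed (polar) coordinates
cart = [r*sp.cos(th), r*sp.sin(th)]   # unprimed (Cartesian) x(r,theta), y(r,theta)

# Jacobian dx^mu/dx'^beta and its inverse dx'^alpha/dx^mu
J = sp.Matrix(2, 2, lambda m, b: sp.diff(cart[m], prim[b]))
Jinv = sp.simplify(J.inv())

# Inhomogeneous part of the transformation law (the Cartesian Gammas are all zero)
Gamma = [[[sp.simplify(sum(Jinv[a, m]*sp.diff(cart[m], prim[b], prim[c])
                           for m in range(2)))
           for c in range(2)] for b in range(2)] for a in range(2)]
```

Despite starting from identically zero Cartesian coefficients, the second-derivative term produces $\Gamma^{\prime r}{}_{\theta\theta}=-r$ and $\Gamma^{\prime\theta}{}_{r\theta}=1/r$, confirming that the $\Gamma$ do not transform as the components of a tensor.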
The rule for $(1,0)$ objects that $\boldsymbol{d} \boldsymbol{v} \equiv \boldsymbol{\nabla} \boldsymbol{v}$ carries over to all $(n, 0)$ tensors, that is, all tensor-valued 0-forms, where we have
{:(34.27)dS=grad S:}\begin{equation*}
d S=\nabla S \tag{34.27}
\end{equation*}
where $\boldsymbol{S}$ is an $(n, 0)$ tensor. It might seem that we can simply make $\boldsymbol{d}$ equivalent to $\boldsymbol{\nabla}$ in its action on all objects. We cannot. This is

${ }^{9}$ Recall that the square brackets here mean $\partial_{\boldsymbol{e}_{\beta}} f$ or, in a coordinate basis, $\partial f / \partial x^{\beta}$, and that this quantity is a number.

${ }^{10}$ We explain how to access the components $S^{i_{1} \ldots i_{n}}{ }_{j_{1} \ldots j_{m} ; k}$ in the next sections.
^(11){ }^{11} By extension, we have
$\boldsymbol{\nabla} \boldsymbol{\omega}^{\nu}=-\Gamma^{\nu}{ }_{\beta \mu} \boldsymbol{\omega}^{\mu} \otimes \boldsymbol{\omega}^{\beta}$.
This should be contrasted with a result we shall establish in Chapter 36 that
${ }^{12}$ Notice how this expression has a minus sign in front of the connection coefficients, in contrast to the vector version.
because the action of d\boldsymbol{d} on a pp-form produces a (p+1)(p+1)-form and not the object whose components are the covariant derivatives. This can be traced back to the fact that the action of d\boldsymbol{d} on pp-forms with p >= 1p \geq 1 involves the antisymmetric wedge product ^^\wedge and not the tensor product ox\otimes. This presents a problem since the action of the connection operator on a tensor does not yield an antisymmetric object, but simply a tensor built from ordinary tensor products. In fact, for a general (n,m)(n, m) tensor S\boldsymbol{S}, we define grad S=S^(i_(1),dotsi_(n))_(j_(1)dotsj_(m);k)(e_(i_(1))ox dots oxe_(i_(n))oxomega^(j_(1))ox dots oxomega^(j_(m))oxomega^(k)),quad\nabla \boldsymbol{S}=S^{i_{1}, \ldots i_{n}}{ }_{j_{1} \ldots j_{m} ; k}\left(\boldsymbol{e}_{i_{1}} \otimes \ldots \otimes \boldsymbol{e}_{i_{n}} \otimes \boldsymbol{\omega}^{j_{1}} \otimes \ldots \otimes \boldsymbol{\omega}^{j_{m}} \otimes \boldsymbol{\omega}^{k}\right), \quad (34.28)
which does not involve the wedge product. ^(10){ }^{10} The point here is that we need a different approach to evaluate the covariant derivative of a pp-form.
Example 34.7
Let's derive the action of the covariant derivative grad_(mu)\nabla_{\mu} on 1-forms. We use the same method we used in the last chapter. This involves combining the 1 -form with a vector via the inner product to make a number. The derivative of the combined vector and 1-form may be found using the Leibniz product rule for tensors.
This product rule for the connection is $\boldsymbol{\nabla}(\boldsymbol{S} \otimes \boldsymbol{T})=\boldsymbol{\nabla} \boldsymbol{S} \otimes \boldsymbol{T}+\boldsymbol{S} \otimes \boldsymbol{\nabla} \boldsymbol{T}$. Take $\boldsymbol{S}=\boldsymbol{\omega}^{\nu}$ and $\boldsymbol{T}=\boldsymbol{e}_{\mu}$ and replace the $\otimes$ operation with the inner product $\langle\,,\,\rangle$. This is permissible as the inner product preserves the linear structure enforced by the tensor product $\otimes$. The derivative of $\left\langle\boldsymbol{\omega}^{\nu}, \boldsymbol{e}_{\mu}\right\rangle=\delta^{\nu}{ }_{\mu}$ is zero, so we obtain
where, in the second line, we used grade_(mu)=Gamma^(alpha)_(beta mu)omega^(beta)oxe_(alpha)\nabla \boldsymbol{e}_{\mu}=\Gamma^{\alpha}{ }_{\beta \mu} \boldsymbol{\omega}^{\beta} \otimes \boldsymbol{e}_{\alpha}.
From the last example we conclude that ^(11){ }^{11}
As shown in the next example, we can also express the result of applying the exterior derivative to 1 -forms in a coordinate frame in terms of the action of the covariant derivative.
Example 34.8
A 2 -form is obtained by acting on the 1-form tilde(A)=A_(mu)dx^(mu)\tilde{\boldsymbol{A}}=A_{\mu} \boldsymbol{d} x^{\mu} with the exterior derivative operator d\boldsymbol{d}. This is written
We see that we can write the components of the 2 -form d tilde(A)=(1)/(2)(d tilde(A))_(beta gamma)dx^(beta)^^dx^(gamma)\boldsymbol{d} \tilde{\boldsymbol{A}}=\frac{1}{2}(\boldsymbol{d} \tilde{\boldsymbol{A}})_{\beta \gamma} \boldsymbol{d} x^{\beta} \wedge \boldsymbol{d} x^{\gamma}
as
where the square bracket notation for antisymmetrization has been used in the last line. ^(13){ }^{13}
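The connection coefficients drop out of this antisymmetrized combination because they are symmetric in their lower indices. A minimal check, assuming `sympy` and hard-coding the plane-polar Christoffel symbols as an illustrative case (not from the text), confirms that $A_{\gamma;\beta}-A_{\beta;\gamma}$ equals the bare antisymmetrized partial derivatives $A_{\gamma,\beta}-A_{\beta,\gamma}$:

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
coords = [r, th]
A = [sp.Function('Ar')(r, th), sp.Function('Ath')(r, th)]

# Symmetric (torsion-free) Christoffel symbols of the flat polar metric,
# indexed as Gamma[mu][alpha][beta] = Gamma^mu_{alpha beta}
Gamma = [[[0, 0], [0, -r]],           # Gamma^r_{theta theta} = -r
         [[0, 1/r], [1/r, 0]]]        # Gamma^theta_{r theta} = Gamma^theta_{theta r} = 1/r

# Covariant derivative of a 1-form: A_{nu;beta} = A_{nu,beta} - Gamma^gamma_{beta nu} A_gamma
def cov(nu, beta):
    return sp.diff(A[nu], coords[beta]) - sum(Gamma[gam][beta][nu]*A[gam] for gam in range(2))

antisym_cov = sp.simplify(cov(1, 0) - cov(0, 1))    # A_{theta;r} - A_{r;theta}
antisym_par = sp.diff(A[1], r) - sp.diff(A[0], th)  # A_{theta,r} - A_{r,theta}
```

The $\Gamma$ terms cancel in pairs, so the exterior derivative of a 1-form can be computed with either partial or covariant derivatives.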
We now extend the action of the covariant derivative from vectors and 1 -forms to tensors in general. Compare the vector v=v^(mu)e_(mu)\boldsymbol{v}=v^{\mu} \boldsymbol{e}_{\mu} to the (2,0) tensor
The vector $\boldsymbol{v}$ is made up of the components $v^{\mu}$ and the basis vectors $\boldsymbol{e}_{\mu}$ and we saw that, when acted on by the derivative operator $\nabla_{\alpha}$ (i.e. the covariant derivative in the direction $\boldsymbol{e}_{\alpha}$), we obtain a contribution $\partial v^{\mu} / \partial x^{\alpha}$ from the components, added to a contribution $\Gamma^{\mu}{ }_{\alpha \beta} v^{\beta}$ from the basis vectors. The tensor $\boldsymbol{T}$ is very similar, only with an additional contribution from a second basis vector. We therefore obtain an additional connection $\Gamma$ for this basis vector. The rule for the derivative of a $(2,0)$ tensor is therefore, in component form,
^(13){ }^{13} This expression is the basis of the useful equation for the 1 -form tilde(alpha)\tilde{\boldsymbol{\alpha}} that says (:d tilde(alpha),u^^v:)=(:grad_(u)( tilde(alpha)),v:)-(:grad_(v)( tilde(alpha)),u:)\langle\boldsymbol{d} \tilde{\boldsymbol{\alpha}}, \boldsymbol{u} \wedge \boldsymbol{v}\rangle=\left\langle\nabla_{u} \tilde{\boldsymbol{\alpha}}, \boldsymbol{v}\right\rangle-\left\langle\nabla_{v} \tilde{\boldsymbol{\alpha}}, \boldsymbol{u}\right\rangle.
(34.42)
This can be shown by writing the inner product in components: $\left(\alpha_{\gamma ; \beta}-\alpha_{\beta ; \gamma}\right) u^{\beta} v^{\gamma}$. (34.43)
The details are left as an exercise, and the expression is used in Chapter 43. Eqn 34.45 was originally claimed in eqn 12.37 in Chapter 12.
Fig. 34.2 A manifold with a metric provides enough structure to define an affine connection, and then curvature, and then the Einstein tensor. These are all required for general relativity. ${ }^{14}$ The Riemann tensor will be revisited in Chapter 35. ${ }^{15}$ This statement is the final word on what the spacetime of general relativity is, from the mathematical point of view. For those interested in the history, Jean le Rond d'Alembert (1717–1783) was probably the first person to express time as the fourth dimension and thereby invent the notion of spacetime. This advance was attributed to Lagrange by E. T. Bell, but there is little evidence for this. See R. G. Van Oss, Historia Mathematica 10, 455 (1983) for a discussion.
Notice how the connection coefficients are contracted against each tensor index in turn.
Since basis 1-forms make a $-\Gamma$ contribution, we similarly expect a contribution of one of these for each basis 1-form. So, for the $(0,2)$ tensor $\boldsymbol{\xi}$ we have
Again, we must contract the connection coefficients against each down index.
Generalizing, the routine for a general (m,n)(m, n) tensor A\boldsymbol{A} is simply to add additional Gamma\Gamma s for each up index and subtract them for each down index:
{:[(grad_(alpha)A)_(beta gamma delta dots kappa)^(mu nu xi dots omega)=(delA_(beta gamma delta dots kappa)^(mu nu xi dots omega))/(delx^(alpha))+Gamma_(alpha lambda)^(mu)A_(beta gamma delta dots kappa)^(lambda nu xi dots omega)+Gamma^(nu)_(alpha lambda)A_(beta gamma delta dots kappa)^(mu lambda xi dots omega)],[+Gamma(" terms for all up indices ")],[-Gamma^(lambda)_(alpha beta)A_(lambda gamma delta dots kappa)^(mu nu xi dots omega)-Gamma^(lambda)_(alpha gamma)A_(beta lambda delta dots kappa)^(mu nu xi dots omega)],[(34.47)-Gamma(" terms for all down indices ").]:}\begin{align*}
\left(\nabla_{\alpha} A\right)_{\beta \gamma \delta \ldots \kappa}^{\mu \nu \xi \ldots \omega}= & \frac{\partial A_{\beta \gamma \delta \ldots \kappa}^{\mu \nu \xi \ldots \omega}}{\partial x^{\alpha}}+\Gamma_{\alpha \lambda}^{\mu} A_{\beta \gamma \delta \ldots \kappa}^{\lambda \nu \xi \ldots \omega}+\Gamma^{\nu}{ }_{\alpha \lambda} A_{\beta \gamma \delta \ldots \kappa}^{\mu \lambda \xi \ldots \omega} \\
& +\Gamma(\text { terms for all up indices }) \\
& -\Gamma^{\lambda}{ }_{\alpha \beta} A_{\lambda \gamma \delta \ldots \kappa}^{\mu \nu \xi \ldots \omega}-\Gamma^{\lambda}{ }_{\alpha \gamma} A_{\beta \lambda \delta \ldots \kappa}^{\mu \nu \xi \ldots \omega} \\
& -\Gamma(\text { terms for all down indices }) . \tag{34.47}
\end{align*}
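As a sanity check of this routine, the sketch below (assuming `sympy`, with the plane-polar coefficients as an illustrative connection) implements the rule for a $(1,1)$ tensor, one $+\Gamma$ term and one $-\Gamma$ term, and applies it to the Kronecker delta $\delta^{\mu}{}_{\nu}$; the two connection terms cancel identically, so the covariant derivative vanishes:

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
coords = [r, th]
n = 2

# Christoffel symbols of the flat polar metric ds^2 = dr^2 + r^2 dtheta^2,
# indexed as Gamma[mu][alpha][beta] = Gamma^mu_{alpha beta}
Gamma = [[[0, 0], [0, -r]],
         [[0, 1/r], [1/r, 0]]]

# (nabla_a A)^m_b = dA^m_b/dx^a + Gamma^m_{a l} A^l_b - Gamma^l_{a b} A^m_l
def cov_11(A, a, m, b):
    return (sp.diff(A[m][b], coords[a])
            + sum(Gamma[m][a][l]*A[l][b] for l in range(n))
            - sum(Gamma[l][a][b]*A[m][l] for l in range(n)))

delta = [[1, 0], [0, 1]]   # Kronecker delta as a (1,1) tensor
DD = [[[sp.simplify(cov_11(delta, a, m, b)) for b in range(n)]
       for m in range(n)] for a in range(n)]
```

Every component of $\nabla \boldsymbol{\delta}$ comes out zero, as it must: the $+\Gamma$ and $-\Gamma$ terms are equal and opposite for the identity tensor.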
This completes the description of the connection and covariant derivative. Of course, the reason we need all of this formalism is that $\nabla_{\boldsymbol{v}}$ is the derivative that is most useful in describing the curvature of spacetime encoded in the metric field. We make this link between the covariant derivative and the metric in the next and final section of this chapter.
34.4 The metric revisited
The central pillar of our geometric theory of Nature is the metric, the fundamental field of the theory of general relativity. Both the connection grad\boldsymbol{\nabla} and the Riemann tensor ^(14)R{ }^{14} \boldsymbol{R} can be thought of as structures that derive from the metric g\boldsymbol{g} defined on the manifold M\mathcal{M}, and these structures underpin general relativity (this idea is described schematically in Fig. 34.2). To specify these essential ingredients we write (M,g)(\mathcal{M}, \boldsymbol{g}). A mathematical statement of general relativity is then as follows: ^(15){ }^{15}
Spacetime is the manifold M\mathcal{M} on which there is a Lorentz metric g\boldsymbol{g}. The curvature of spacetime, described by g\boldsymbol{g}, is related to the distribution of matter in spacetime by the Einstein equation.
As we've seen previously in this book, the metric tensor is defined mathematically to be a non-singular (0,2)(0,2) tensor with the property g(e_(mu),e_(nu))=g_(mu nu)\boldsymbol{g}\left(\boldsymbol{e}_{\mu}, \boldsymbol{e}_{\nu}\right)=g_{\mu \nu}. The definition implies that the metric tensor can be built from the tensor product of basis 1-forms: g=g_(mu nu)omega^(mu)oxomega^(nu)\boldsymbol{g}=g_{\mu \nu} \boldsymbol{\omega}^{\mu} \otimes \boldsymbol{\omega}^{\nu}, or, introducing some alternative notation, we can write things in terms of a line element tensor, ds^(2)=g_(mu nu)dx^(mu)ox dx^(nu)\boldsymbol{d} s^{2}=g_{\mu \nu} \boldsymbol{d} x^{\mu} \otimes \boldsymbol{d} x^{\nu}. Written in slot machine form, we have
With the metric in our toolkit, we can show how possessing a connection grad\boldsymbol{\nabla} allows us access to the notion of parallelism and parallel transport. One possible version of parallel transport might be that the tangent vector u\boldsymbol{u} to a geodesic, parametrized by lambda\lambda, is moved along the curve and simply returns a vector proportional to that tangent vector ^(16){ }^{16}
A connection that obeys this very general statement of parallelism is called a non-affine connection. Such connections are characterized by parametrizations where $\lambda$ does not mark off regular intervals along the curve, as in Fig. 34.3 (bottom). When a connection is non-affine the vector $\boldsymbol{u}$ can get longer or shorter as it moves around, but always remains parallel to itself.
In contrast, an affine connection, where lambda\lambda marks off regular intervals along the path, is characterized by the much more restrictive statement of parallelism that, for a geodesic parametrized by lambda\lambda, we have ^(17){ }^{17}
which we recognize as the geodesic equation. When this condition holds we have the state of affairs we have previously described as parallel transport. The tangent vector is transported along the geodesic remaining parallel to itself and, crucially, its length remains constant (Fig 34.3, top). Tangent vectors of constant length are something we certainly want from our physical theory, since we require, for example, that the velocity vector u\boldsymbol{u}, which is tangent to the world line of a massive particle, has a constant magnitude such that u*u=-1\boldsymbol{u} \cdot \boldsymbol{u}=-1.
It is the metric that guarantees that the connection is affine. As a consequence, we must now permanently fix the connection and the metric together. This link between connection and metric is made with the compatibility condition ^(18){ }^{18}
This equation implies that the covariant derivative grad_(u)g=0\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{g}=0 when taken along a curve with tangent vector u\boldsymbol{u}. As we show in the example below, this condition is enough to ensure that the length of any vector is a constant when it is parallel transported.
Example 34.9
The vectors v\boldsymbol{v} and w\boldsymbol{w} are parallel transported along the curve whose tangent is u\boldsymbol{u}. That is
If the lengths of the vectors are unchanged by this operation then the inner product g(v,w)\boldsymbol{g}(\boldsymbol{v}, \boldsymbol{w}) cannot change on parallel transportation. As a result we have
${ }^{16}$ As in Part II, we again use the $\mathrm{D} / \mathrm{d} \lambda$ notation here to denote a covariant derivative taken along the curve parametrized by $\lambda$ in a spacetime with a connection. ${ }^{17}$ In Exercise 34.7, we show that if a parametrization $\lambda$ obeys this equation, then an alternative parametrization $s=a \lambda+b$, with $a$ and $b$ constants, also obeys the equation.
Fig. 34.3 Affine and non-affine connections. ^(18){ }^{18} We note that there is an approach to gravity where the connection and the metric are treated as independent variables, with respect to which the Einstein-Hilbert action is varied (see Chapter 40). This is known as Palatini gravity. However, if the action is given by eqn 40.36 , formed from the (not necessarily metric-compatible) connection, then the two versions of gravity are equivalent.
or, written in components
Using the Leibniz rule: quadu^(alpha)(g_(beta gamma)v^(beta)w^(gamma))_(;alpha)=0\quad u^{\alpha}\left(g_{\beta \gamma} v^{\beta} w^{\gamma}\right)_{; \alpha}=0. quadu^(alpha)g_(beta gamma;alpha)v^(beta)w^(gamma)+u^(alpha)g_(beta gamma)v^(beta)_(;alpha)w^(gamma)+u^(alpha)g_(beta gamma)v^(beta)w^(gamma)_(;alpha)=0\quad u^{\alpha} g_{\beta \gamma ; \alpha} v^{\beta} w^{\gamma}+u^{\alpha} g_{\beta \gamma} v^{\beta}{ }_{; \alpha} w^{\gamma}+u^{\alpha} g_{\beta \gamma} v^{\beta} w^{\gamma}{ }_{; \alpha}=0.
The compatibility condition makes the components of $\boldsymbol{\nabla} \boldsymbol{g}$ all zero, and the parallel-transport conditions remove the remaining two terms, so this object vanishes.
As a special case, we also see that for a geodesic with tangent $\boldsymbol{u}$, fixing $g_{\beta \gamma ; \alpha}=0$ and $\boldsymbol{g}(\boldsymbol{u}, \boldsymbol{u})=-1$ ensures that the geodesic equation $\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{u} \equiv u^{\nu} u^{\mu}{ }_{; \nu}=0$ is obeyed, as required. ${ }^{19}$
This makes the crucial link between the connection and the metric. The condition $\boldsymbol{\nabla} \boldsymbol{g}=0$ (along with the vanishing of torsion) is enough to fix the action of $\nabla$ as the operator giving an affine connection. The reason that the covariant derivative is the suitable derivative for relativity is that it is built on the foundation of parallelism, which involves having access to a metric, and general relativity is the field theory of the metric field.
On a more practical note, metric components are useful in raising and lowering indices. The semicolon notation provides the means for doing this in expressions involving the covariant derivative, as we examine below.
Example 34.10
We act on the components of the covariant derivative with the metric components. We have
This last equation causes us to pause since Gamma\Gamma is not a tensor in its latter two down components, and so we cannot simply raise an index as we might be tempted to. This isn't a problem if we simply follow eqn 34.35 and write g_(mu nu)v^(mu)_(;alpha)=v_(nu;alpha)=v_(nu,alpha)-Gamma^(gamma)_(alpha nu)v_(gamma)g_{\mu \nu} v^{\mu}{ }_{; \alpha}=v_{\nu ; \alpha}=v_{\nu, \alpha}-\Gamma^{\gamma}{ }_{\alpha \nu} v_{\gamma},
which gives us a way to interpret the final line in eqn 34.58 .
Chapter summary
The connection $\boldsymbol{\nabla}$ allows a derivative to be defined that is suitable for describing tensor fields in the curved spacetimes of general relativity.
The components of grad\nabla are the connection coefficients Gamma^(mu)_(alpha beta)\Gamma^{\mu}{ }_{\alpha \beta}.
An affine connection is guaranteed by the condition grad g=0\nabla \boldsymbol{g}=0.
Exercises
(34.1) Use the symmetry of the covariant derivative to show
{:(34.60)grad_(u+n)v=grad_(u)v+grad_(n)v:}\begin{equation*}
\nabla_{u+n} v=\nabla_{u} v+\nabla_{n} v \tag{34.60}
\end{equation*}
(34.2) (a) Use the definition $\boldsymbol{e}_{\alpha} \cdot \boldsymbol{e}_{\beta}=g_{\alpha \beta}$, along with the action of $\boldsymbol{\nabla}$ on basis vectors, to compute the derivative $\boldsymbol{\nabla}\left(\boldsymbol{e}_{\alpha} \cdot \boldsymbol{e}_{\beta}\right)$, and use this to show
that links the connection coefficients to the derivatives of the components of g\boldsymbol{g}.
(34.3) (a) For an arbitrary matrix M_\underline{\boldsymbol{M}} there is an identity
Prove this by considering a variation in ln(detM_)\ln (\operatorname{det} \underline{\boldsymbol{M}}) owing to a variation deltax^(lambda)\delta x^{\lambda} that gives
(34.4) Using components, take the covariant derivative grad_(u)\boldsymbol{\nabla}_{u} of an inner product of vector and 1-form (: tilde(sigma),v:)=f\langle\tilde{\boldsymbol{\sigma}}, \boldsymbol{v}\rangle=f, where ff is a function, and use this to prove the rule for the components of the covariant derivative of a 1 -form.
(34.5) (a) If $\lambda$ is an affine parameter such that $u^{\alpha}=\mathrm{d} x^{\alpha} / \mathrm{d} \lambda$, show that
(b) Using this, show that if the metric functions g_(mu nu)g_{\mu \nu} are independent of a coordinate x^(1)x^{1} then u_(1)u_{1} is a constant along the particle's geodesic world line. This amounts to the rule for finding Killing vectors that we discussed in the last chapter.
(34.6) The exterior derivative of a 1 -form
Consider the number formed by filling the slots of a 2-form as follows: d tilde(A)(u,v)\boldsymbol{d} \tilde{\boldsymbol{A}}(\boldsymbol{u}, \boldsymbol{v}), where tilde(A)\tilde{\boldsymbol{A}} is a 1-form and u\boldsymbol{u} and v\boldsymbol{v} are vectors.
(a) Show that when there is a connection, we can express this quantity as
d tilde(A)(u,v)=grad_(u)(: tilde(A),v:)-grad_(v)(: tilde(A),u:)- tilde(A)([u,v])d \tilde{A}(u, v)=\nabla_{u}\langle\tilde{A}, v\rangle-\nabla_{v}\langle\tilde{A}, u\rangle-\tilde{A}([u, v])
(34.70)(34.70)
(b) This suggests another route to eqn 34.69. Since we know that the action of the exterior and the covariant derivative on a scalar are identical, we could have started by considering the candidate quantity $\nabla_{\boldsymbol{u}}\langle\tilde{\boldsymbol{A}}, \boldsymbol{v}\rangle$ and then antisymmetrized it. Expand the resulting candidate expression for $\boldsymbol{d} \tilde{\boldsymbol{A}}(\boldsymbol{u}, \boldsymbol{v})$, namely $\boldsymbol{\nabla}_{\boldsymbol{u}}\langle\tilde{\boldsymbol{A}}, \boldsymbol{v}\rangle-\boldsymbol{\nabla}_{\boldsymbol{v}}\langle\tilde{\boldsymbol{A}}, \boldsymbol{u}\rangle$, and obtain
Since there should be no dependence on the choice of vectors u\boldsymbol{u} and v\boldsymbol{v}, we would then need to subtract the part of the resulting expression that depends on [u,v][\boldsymbol{u}, \boldsymbol{v}], since this encodes the Lie derivative of the vectors.
(c) Starting from the expression from part (b) that we have now 'derived', show that
Hint: Assume the arbitrary vectors u\boldsymbol{u} and v\boldsymbol{v} in (b) are constant in space.
(34.7) Consider a path parametrized by an affine parameter lambda\lambda. Define a new parametrization s=f(lambda)s=f(\lambda). Show that this makes no difference to the geodesic equation if ss and lambda\lambda are linearly related.
(34.8) Consider a non-geodesic curve in a spacetime with an affine connection. Show that the compatibility condition implies that if the magnitude of the tangent vector u\boldsymbol{u} is a constant along the curve, then u*(grad_(u)u)=0\boldsymbol{u} \cdot\left(\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{u}\right)=0.
Physically, this means that an affine parametrization that keeps the magnitude of the velocity vector constant for any curve has a $\boldsymbol{u}$ that is perpendicular to the acceleration $\boldsymbol{a}=\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{u}$, just as we have in the flat spacetime of special relativity. See Needham (2021) for a discussion of the material in this problem.
Riemann curvature revisited
Abstract
After Riemann had made known his discoveries, mathematicians busied themselves with working out his system of geometrical ideas formally; chief among these were Christoffel, Ricci, and Levi-Civita. Riemann ... clearly left the real development of his ideas in the hands of some subsequent scientist whose genius as a physicist could rise to equal flights with his own as a mathematician. After a lapse of seventy years this mission has been fulfilled by Einstein. Hermann Weyl (1885-1955) Space - Time - Matter
You wake up in a space station and feel a force keeping you in bed. If you can't tell whether this is because there's a gravitational field present or because the station is accelerating, then there's a simple experiment to carry out. Allow two particles to fall freely, starting them in motion on parallel paths, and later measure their relative acceleration. If this acceleration is non-zero it suggests the presence of a genuine curvature in spacetime. This experiment, a measurement of geodesic deviation, is a true test of spacetime curvature and hence of real gravitational fields, as it allows access to the Riemann tensor R\boldsymbol{R}. It is this tensor that provides the key to assessing whether spacetime is curved.
In this chapter, we revisit the Riemann tensor and investigate the geometrical method of calculating R\boldsymbol{R}. The tools we need are the Lie and covariant derivatives from the previous two chapters. The discussion here will lead us to an operator equation for the curvature tensor that is very useful in actually computing the curvature for different spacetimes. ^(1){ }^{1}
This chapter revisits the Riemann curvature tensor using the geometrical tools we have built in this part of the book, allowing us to rederive some key results in a more elegant manner and derive some new ones. This material provides an insight into the role curvature plays in the physics of the Universe.
35.1 Geodesic deviation (slight return)
In (3+1) dimensions, the Riemann tensor R(,,\boldsymbol{R}(,,,)isa(1,3)) is a (1,3) object that encodes the curvature of spacetime. If we insert three carefully chosen vectors into the latter slots, R\boldsymbol{R} returns a vector whose physical interpretation is that it encodes the relative acceleration of neighbouring geodesics. The properties of the tensor can be derived by considering the deviation of two freely falling particles following separate geodesics, much as in the thought experiment described above. This is our first task in this chapter.
Fig. 35.1 Geodesics for different freely falling particles used to compute geodesic deviation. ^(2){ }^{2} This expression is sometimes called the 'geodesic deviation equation', and sometimes called the 'Jacobi equation', reflecting Jacobi's discovery of it in 1837.
The separation of the geodesics is described by the vector $\boldsymbol{n}$. This important vector can be understood with reference to Fig. 35.1. A typical geodesic, labelled $n$, is parametrized with affine parameter $\lambda$ and has velocity given by the value of the tangent vector field $\boldsymbol{u}$ evaluated at a point on the geodesic. There will be lots of other geodesics in the vicinity of a particular point $\lambda_{0}$. The closest is labelled $n+\mathrm{d} n$, the next $n+2 \mathrm{~d} n$ and so on. This gives us access to the vector $\boldsymbol{n}=\mathrm{d} / \mathrm{d} n$ that we interpret as providing a measure of the relative separation of neighbouring geodesics.
Now that we have the separation vector $\boldsymbol{n}$ we need to evaluate it as the particles fall along their geodesic world lines. The tangents to the particles' geodesics are also provided by the velocity field $\boldsymbol{u}$ evaluated at the points of interest on the relevant geodesic. (Recall that the geodesics form a congruence of curves for the vector field $\boldsymbol{u}$.) The falling of the particles can then be viewed in terms of the vector $\boldsymbol{n}$ being carried along the streamlines of the velocity field $\boldsymbol{u}$. The change in $\boldsymbol{n}$ is therefore described by the Lie derivative $£_{\boldsymbol{u}} \boldsymbol{n}$.
Now recall the notion of a field being Lie dragged. This occurs if the vector that stretches between two curves of a congruence still stretches between them after being carried along the congruence. This is exactly the situation we require here to describe the defining property of $\boldsymbol{n}$. From the definition of $\boldsymbol{n}$ being the vector that links the two particles at all points as they fall, we must have the condition that, although $\boldsymbol{n}$ can change as the particles fall, its Lie derivative must vanish: $£_{\boldsymbol{u}} \boldsymbol{n}=0$. This encodes the condition that the vector $\boldsymbol{n}$ still stretches between the particles after being transported along $\boldsymbol{u}$. We are now almost ready to derive the most important equation of this chapter. This is the equation that says that the relative acceleration of the geodesics is related to the Riemann tensor by ${ }^{2}$
We shall provide the geometric derivation of this equation and this will also give us the Riemann tensor.
Example 35.1
Before we get there, it's very useful, as a warm up, to first consider the behaviour of two particles falling in a gravitational field according to Newtonian gravitation. Specifically, let's consider the vector nn, with components n^(k)n^{k}, that connects the trajectories of the two particles. We'll calculate the relative acceleration n^(¨)^(k)\ddot{n}^{k}. This is useful as it closely mirrors the full relativistic derivation that we shall discuss afterwards. The vector of interest is n-=d//dn=(dx^(k)//dn)(del//delx^(k))\boldsymbol{n} \equiv \mathrm{d} / \mathrm{d} n=\left(\mathrm{d} x^{k} / \mathrm{d} n\right)\left(\partial / \partial x^{k}\right). The component n^(k)n^{k} is therefore dx^(k)//dn\mathrm{d} x^{k} / \mathrm{d} n. We therefore want to find
This gives us an equation of motion n^(¨)^(k)+R^(k)_(j)n^(j)=0\ddot{n}^{k}+R^{k}{ }_{j} n^{j}=0, and an expression for the components ^(3)R^(k)_(j){ }^{3} R^{k}{ }_{j} of a tensor that can be used to determine the acceleration of the separation of the trajectories.
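As a sketch of this Newtonian warm-up, the tidal tensor can be computed symbolically. Here we take $R^{k}{ }_{j}=\partial^{2} \Phi / \partial x^{k} \partial x^{j}$, the Hessian of the Newtonian potential, and use the point-mass potential $\Phi=-G M / r$ as an illustrative assumption (the specific potential is not taken from the text):

```python
import sympy as sp

# Newtonian tidal tensor R^k_j = d²Φ/dx^k dx^j for the (assumed,
# illustrative) point-mass potential Φ = -GM/r.
x, y, z, G, M = sp.symbols('x y z G M', positive=True)
r = sp.sqrt(x**2 + y**2 + z**2)
Phi = -G * M / r

coords = (x, y, z)
R = sp.Matrix(3, 3, lambda i, j: sp.diff(Phi, coords[i], coords[j]))

# In vacuum ∇²Φ = 0, so the tidal tensor is traceless: radial
# stretching is exactly balanced by transverse squeezing.
print(sp.simplify(R.trace()))                 # 0
# On the x-axis the matrix is diag(-2GM/x³, GM/x³, GM/x³).
print(sp.simplify(R.subs({y: 0, z: 0})))
```

With the equation of motion $\ddot{n}^{k}+R^{k}{ }_{j} n^{j}=0$, the negative radial entry corresponds to tidal stretching along the line to the mass, and the positive transverse entries to squeezing.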
Let's now consider the full, geometrical version of the problem considered in the previous example. This follows exactly the same pattern: we simply want to calculate the acceleration of n\boldsymbol{n}. That is to say, we want to find the double derivative of n\boldsymbol{n} with respect to an affine parameter: D^(2)//dlambda^(2)\mathrm{D}^{2} / \mathrm{d} \lambda^{2}, which is equivalent to grad_(u)grad_(u)n\nabla_{\boldsymbol{u}} \nabla_{\boldsymbol{u}} \boldsymbol{n}. From above, we recall that the relative separation vector n\boldsymbol{n} is Lie dragged, which is to say that
We also know that in a torsion-free system grad_(u)n-grad_(n)u=[u,n]\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{n}-\boldsymbol{\nabla}_{\boldsymbol{n}} \boldsymbol{u}=[\boldsymbol{u}, \boldsymbol{n}], so the Lie dragging is equivalently described by the equation
\begin{equation*}
\nabla_{\boldsymbol{u}} \boldsymbol{n}=\nabla_{\boldsymbol{n}} \boldsymbol{u} \tag{35.7}
\end{equation*}
which is the symmetry of the covariant derivative, from the last chapter. We now have the tools at our disposal to find grad_(u)grad_(u)n\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{n}.
Example 35.2
Consider a geodesic with tangent vector u\boldsymbol{u}. As usual it is described by the geodesic equation grad_(u)u=0\nabla_{u} \boldsymbol{u}=0. We take the covariant derivative of this expression along the n\boldsymbol{n} direction
Next, we use the commutator [grad_(n),grad_(u)]=grad_(n)grad_(u)-grad_(u)grad_(n)\left[\nabla_{\boldsymbol{n}}, \boldsymbol{\nabla}_{u}\right]=\boldsymbol{\nabla}_{\boldsymbol{n}} \boldsymbol{\nabla}_{\boldsymbol{u}}-\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{\nabla}_{\boldsymbol{n}}, to write
Finally, use the symmetry of the covariant derivative grad_(n)u=grad_(u)n\boldsymbol{\nabla}_{\boldsymbol{n}} \boldsymbol{u}=\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{n} to say
Although this looks like it solves the problem via an operator $\left[\boldsymbol{\nabla}_{\boldsymbol{n}}, \boldsymbol{\nabla}_{\boldsymbol{u}}\right]$, it does not quite. We discuss this below.

${ }^{4}$ We define the second derivative $\boldsymbol{\nabla}_{\mu} \boldsymbol{\nabla}_{\nu} \boldsymbol{c}$ to be $\boldsymbol{\nabla}_{\mu}\left(\boldsymbol{\nabla}_{\nu} \boldsymbol{c}\right)$, with components $c^{\alpha}{ }_{; \nu \mu}=\left(c^{\alpha}{ }_{; \nu}\right)_{; \mu}$.
The operator [grad_(a),grad_(b)]\left[\boldsymbol{\nabla}_{\boldsymbol{a}}, \boldsymbol{\nabla}_{\boldsymbol{b}}\right] isn't quite the one we need to compute the Riemann tensor R\boldsymbol{R} to evaluate the curvature of general spaces. We saw above that [u,n]=0[\boldsymbol{u}, \boldsymbol{n}]=0, which implies that a loop formed by vectors u\boldsymbol{u} and n\boldsymbol{n} closes. However, a loop constructed by travelling along arbitrary vectors a\boldsymbol{a} then b\boldsymbol{b} won't necessarily meet up with one where we travel along b\boldsymbol{b} and then along a\boldsymbol{a}. The distance between the end points is measured by the Lie bracket [a,b][\boldsymbol{a}, \boldsymbol{b}]. As a result we actually need the slightly upgraded operator [grad_(a),grad_(b)]-grad_([a,b])\left[\boldsymbol{\nabla}_{\boldsymbol{a}}, \boldsymbol{\nabla}_{\boldsymbol{b}}\right]-\boldsymbol{\nabla}_{[\boldsymbol{a}, \boldsymbol{b}]} to correctly generalize the operator so that we capture the Riemann tensor. In a coordinate frame, we have [e_(mu),e_(nu)]=0\left[\boldsymbol{e}_{\mu}, \boldsymbol{e}_{\nu}\right]=0, simplifying the curvature operator. We shall often work in coordinate frames, making this correction a negligible detail.
The result of this argument is that the curvature tensor R\boldsymbol{R} can be formed into a Riemann curvature operator hat(R)\hat{\boldsymbol{R}} defined by
\begin{equation*}
\boldsymbol{R}(\,\cdot\,, \boldsymbol{c}, \boldsymbol{a}, \boldsymbol{b})=\hat{\boldsymbol{R}}(\boldsymbol{a}, \boldsymbol{b})\, \boldsymbol{c}=\left(\left[\nabla_{\boldsymbol{a}}, \nabla_{\boldsymbol{b}}\right]-\nabla_{[\boldsymbol{a}, \boldsymbol{b}]}\right) \boldsymbol{c} \tag{35.12}
\end{equation*}
The operator needs two vectors (here, a\boldsymbol{a} and b\boldsymbol{b} ) in order to build it. It can then act on a vector (here, c\boldsymbol{c} ) to output another vector. In fact, as shown in the exercises, in a coordinate frame where the loop formed by the basis vectors closes, we end up with the memorable component equation ^(4){ }^{4}
where c^(alpha)c^{\alpha} are the components of a vector.
The Riemann curvature operator turns out to be very useful and we shall work with it in the next chapter where we calculate the tensor for a variety of spacetimes. For now, we conclude that the Riemann curvature results from the double derivative of the vector field u\boldsymbol{u} via
This is simply eqn 35.1 rewritten in terms of covariant derivatives and the Riemann operator. One of the most important things to note about the Riemann tensor and operator is that the curvature is represented by a double covariant derivative. This point will be important in the next chapter when we come to finding an efficient means of computing R\boldsymbol{R}.
35.2 Components of the curvature tensor
Since the curvature tensor R\boldsymbol{R} is a (1,3)(1,3) tensor, inserting the basis vectors gives us an expression for the components R^(alpha)_(beta gamma delta)R^{\alpha}{ }_{\beta \gamma \delta}, which can be interpreted in terms of the connection coefficients Gamma^(alpha)_(mu nu)\Gamma^{\alpha}{ }_{\mu \nu}. To extract the components we write
where we've used the curvature operator in the final expression. Let's now show that this gives the expression for components of R\boldsymbol{R} that we derived in Chapter 11.
Example 35.3
We work in a coordinate basis, so that grad_([e_(alpha),e_(beta)])=0\nabla_{\left[e_{\alpha}, e_{\beta}\right]}=0. We therefore have a curvature operator
Recall that we defined ^(5)grad_(mu)e_(nu)=Gamma^(alpha)_(mu nu)e_(alpha){ }^{5} \nabla_{\mu} \boldsymbol{e}_{\nu}=\Gamma^{\alpha}{ }_{\mu \nu} \boldsymbol{e}_{\alpha}. Acting on e_(nu)\boldsymbol{e}_{\nu} with a term like grad_(alpha)grad_(beta)\nabla_{\alpha} \nabla_{\beta} we obtain ^(6){ }^{6}
Putting this together via R^(alpha)_(beta gamma delta)=(:omega^(alpha),( hat(R))(e_(gamma),e_(delta))e_(beta):)R^{\alpha}{ }_{\beta \gamma \delta}=\left\langle\boldsymbol{\omega}^{\alpha}, \hat{\boldsymbol{R}}\left(\boldsymbol{e}_{\gamma}, \boldsymbol{e}_{\delta}\right) \boldsymbol{e}_{\beta}\right\rangle we have
\begin{equation*}
R^{\alpha}{ }_{\beta \gamma \delta}=\frac{\partial \Gamma^{\alpha}{ }_{\delta \beta}}{\partial x^{\gamma}}-\frac{\partial \Gamma^{\alpha}{ }_{\gamma \beta}}{\partial x^{\delta}}+\Gamma^{\alpha}{ }_{\gamma \mu} \Gamma^{\mu}{ }_{\delta \beta}-\Gamma^{\alpha}{ }_{\delta \mu} \Gamma^{\mu}{ }_{\gamma \beta}, \tag{35.20}
\end{equation*}
just as we found in Chapter 11. We conclude that the Riemann tensor defined geometrically in terms of geodesic deviation is identical to the one we introduced back in Chapter 11.
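The component formula of eqn 35.20 can be checked mechanically. The sketch below (an illustration using the unit 2-sphere, a stand-in metric not taken from the text) builds the Christoffel symbols from the metric and applies the formula directly:

```python
import sympy as sp

# Eqn 35.20 applied to the unit 2-sphere, ds² = dθ² + sin²θ dφ²
# (a stand-in metric chosen for illustration).
th, ph = sp.symbols('theta phi')
coords = [th, ph]
g = sp.diag(1, sp.sin(th)**2)
ginv = g.inv()
dim = 2

# Christoffel symbols: Γ^a_{bc} = ½ g^{ad}(∂_b g_{dc} + ∂_c g_{db} - ∂_d g_{bc})
Gamma = [[[sp.simplify(sum(sp.Rational(1, 2) * ginv[a, d]
           * (sp.diff(g[d, c], coords[b]) + sp.diff(g[d, b], coords[c])
              - sp.diff(g[b, c], coords[d])) for d in range(dim)))
           for c in range(dim)] for b in range(dim)] for a in range(dim)]

def Riem(a, b, c, d):
    """R^a_{bcd} from eqn 35.20."""
    expr = sp.diff(Gamma[a][d][b], coords[c]) - sp.diff(Gamma[a][c][b], coords[d])
    expr += sum(Gamma[a][c][m] * Gamma[m][d][b]
                - Gamma[a][d][m] * Gamma[m][c][b] for m in range(dim))
    return sp.simplify(expr)

print(Riem(0, 1, 0, 1))   # sin(theta)**2: the sphere's constant positive curvature
```

The antisymmetry in the last index pair, $R^{a}{ }_{b c d}=-R^{a}{ }_{b d c}$, falls out of the same computation.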
The geodesic deviation equation gives us a tool to expand on the discussion of the local orthonormal frames from Chapter 10. There we introduced the freely falling frame as one example of a local inertial frame (LIF). In the next example, we shall find a useful set of coordinates that describe another sort of LIF, known as Riemann normal coordinates.
Example 35.4
This idea is most straightforwardly described in flat space. Erect a set of orthonormal axes ^(7)e_(alpha)^{7} \boldsymbol{e}_{\alpha} at the origin and then send straight lines out in every direction. To get to any point P\mathcal{P} in the neighbourhood of the origin we need only pick a straight line and then travel along it through a distance lambda\lambda until we reach P\mathcal{P}. The straight lines can be characterized by their (normalized, unit) gradient vectors u\boldsymbol{u} [Fig. 35.2(a)]. Any point can therefore be reached by translating through a vector lambda u\lambda \boldsymbol{u}. The coordinates of this point zeta^(alpha)\zeta^{\alpha}, can be written as
or zeta^(alpha)=lambdau^(alpha)\zeta^{\alpha}=\lambda u^{\alpha}. As a concrete example, the point (x,y)=(2,1)(x, y)=(2,1) [Fig. 35.2(b)] lies along the line defined by the tangent u=(1)/(sqrt5)(2e_(x)+e_(y))\boldsymbol{u}=\frac{1}{\sqrt{5}}\left(2 \boldsymbol{e}_{x}+\boldsymbol{e}_{y}\right), so that we have
If we simply parametrize the line using the distance, then the point (2,1)(2,1) is found at lambda=sqrt5\lambda=\sqrt{5}, so the coordinates are zeta^(x)=2,zeta^(y)=1\zeta^{x}=2, \zeta^{y}=1. If we choose our parametrization differently, these would be scaled accordingly.
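The flat-space construction can be made concrete in a few lines (a trivial numerical sketch of the example above):

```python
import numpy as np

# The flat-space example in numbers: the point (2, 1) lies on the line
# with unit tangent u = (2 e_x + e_y)/√5, reached at parameter λ = √5.
u = np.array([2.0, 1.0]) / np.sqrt(5.0)
lam = np.sqrt(5.0)
zeta = lam * u        # coordinates ζ^α = λ u^α
print(zeta)           # [2. 1.]
```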
The idea of Riemann normal coordinates is to generalize this procedure to curved space. So we start by erecting a set of orthonormal axes e_(alpha)\boldsymbol{e}_{\alpha} at the origin. Since the generalization of a straight line is a geodesic, we send out geodesics in every direction from the origin [Fig. 35.2(c)]. The geodesics can be timelike or spacelike. They are characterized by their (unit) tangent vectors u\boldsymbol{u} and are parametrized by affine parameter lambda\lambda that tells us where we are on the curve. Select an event at P\mathcal{P} and reach it by travelling a distance lambda\lambda along the geodesic with gradient u\boldsymbol{u}. The coordinates of the point P\mathcal{P} are given the Riemann normal coordinates zeta^(mu)\zeta^{\mu} such that
Fig. 35.2 (a) Unit tangents to geodesics in a flat plane. (b) The point $(x, y)=(2,1)$ reached by following a unit tangent through a distance given by the parameter $\lambda$. (c) Geodesics in a curved spacetime.

${ }^{7}$ We won't give the indices hats in this example to save on clutter.

${ }^{9}$ Why do we do this? Referring to Fig. 35.1 and using eqn 35.7, we could interpret the first term in eqn 35.14, in the form $\nabla_{\boldsymbol{u}} \nabla_{\boldsymbol{u}} \boldsymbol{n} \Delta a \Delta b$, as equivalent to the change $\delta \boldsymbol{u}$ due to parallel transport of $\boldsymbol{u}$ around a loop with sides $\boldsymbol{n} \Delta a$ and $\boldsymbol{u} \Delta b$, and we know from eqn 35.14 that $\delta \boldsymbol{u}+\hat{\boldsymbol{R}}(\boldsymbol{n}, \boldsymbol{u}) \boldsymbol{u} \Delta a \Delta b=0$. Here we extend this idea to describe parallel transport of a vector $\boldsymbol{A}$ around a loop with sides $\boldsymbol{u} \Delta a$ and $\boldsymbol{v} \Delta b$ to obtain the expression $\delta \boldsymbol{A}+\hat{\boldsymbol{R}}(\boldsymbol{u}, \boldsymbol{v}) \boldsymbol{A} \Delta a \Delta b=0$.

Fig. 35.3 Parallel transport of a vector around a loop gives access to the Riemann tensor. The vector is $\boldsymbol{A}$ originally and changes to $\boldsymbol{A}^{\prime}=\boldsymbol{A}+\delta \boldsymbol{A}$ after transport.
so that we have zeta^(alpha)=lambdau^(alpha)\zeta^{\alpha}=\lambda u^{\alpha}.
In terms of the Riemann normal coordinates, the spacetime near the origin $\mathcal{O}$ can be shown ${ }^{8}$ to have metric components
At the origin we have zeta^(alpha)=0\zeta^{\alpha}=0, and so g_(mu nu)=eta_(mu nu)g_{\mu \nu}=\eta_{\mu \nu}. Since there are no linear terms in zeta^(mu)\zeta^{\mu} in eqn 35.24 we have g_(mu nu,alpha)(O)=0g_{\mu \nu, \alpha}(\mathcal{O})=0, implying that the connection coefficients vanish. We have therefore described a LIF: flat with vanishing connection coefficients.
35.3 Parallel transport again
Using the geometrical concepts of the previous few chapters, we can return to the parallel transport method that we employed back in Chapter 11 to compute the components of the Riemann tensor. There we carried out a rather messy computation that we can simplify enormously with our new machinery. As we saw in Chapter 11, the Riemann tensor measures the curvature via the angle through which a vector changes when it is parallel transported around a closed path. Here we shall use a coordinate-free approach employing the covariant derivative to assess the change delta A\delta \boldsymbol{A} in a vector A\boldsymbol{A}, when transported around the loop shown in Fig. 35.3. ^(9){ }^{9}
We carry a vector A\boldsymbol{A} along a vector u\boldsymbol{u} using the covariant derivative grad_(u)A\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{A}, which computes the value of A\boldsymbol{A} at the tip of u\boldsymbol{u} minus the value at the base. There are two contributions to this directional derivative: the total change in the vector field A\boldsymbol{A} between the start and end points, and the correction due to the change in the basis vectors. The latter is achieved through subtracting off the vector A\boldsymbol{A} parallel transported along the path. Since we are going to close the loop, the first of these contributions is zero (since the field A\boldsymbol{A} is single valued) and we are simply left with the parallel transport correction. As a result, computing the covariant derivative of the vector field A\boldsymbol{A} around a closed loop, outputs (minus) the result of parallel transporting the vector A\boldsymbol{A} around the loop. Referring to Fig. 35.3, we shall show in the next example, that the change in the vector field delta A\delta \boldsymbol{A} on traversing the loop is given by
\begin{equation*}
\delta \boldsymbol{A}+\hat{\boldsymbol{R}}(\boldsymbol{u}, \boldsymbol{v}) \boldsymbol{A} \Delta a \Delta b=0 \tag{35.25}
\end{equation*}
where hat(R)\hat{\boldsymbol{R}} is the curvature operator.
Example 35.5
In terms of the covariant derivative, the change -delta A-\delta \boldsymbol{A} is given by traversing the loop in Fig. 35.3 in an anticlockwise direction, which we reorder into contributions:
\begin{align*}
-\delta \boldsymbol{A}= & +\boldsymbol{\nabla}_{\boldsymbol{v}} \boldsymbol{A} \Delta b && \text{(moving along path labelled } \boldsymbol{v} \Delta b \text{)} \\
& -\boldsymbol{\nabla}_{\boldsymbol{v}} \boldsymbol{A} \Delta b && \text{(moving along } -\boldsymbol{v} \Delta b \text{)} \\
& -\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{A} \Delta a && \text{(moving along } -\boldsymbol{u} \Delta a \text{)} \\
& +\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{A} \Delta a && \text{(moving along } \boldsymbol{u} \Delta a \text{)} \\
& -\boldsymbol{\nabla}_{[\boldsymbol{u}, \boldsymbol{v}]} \boldsymbol{A} \Delta a \Delta b && \text{(moving along } -[\boldsymbol{u}, \boldsymbol{v}] \Delta a \Delta b \text{).}
\end{align*}
The key here is to recognize that the movement along v Delta b\boldsymbol{v} \Delta b is displaced from the movement along -v Delta b-\boldsymbol{v} \Delta b by a distance u Delta a\boldsymbol{u} \Delta a. We can therefore make the replacement
\begin{equation*}
\left.\nabla_{\boldsymbol{v}} \boldsymbol{A} \Delta b\right|_{0}-\left.\nabla_{\boldsymbol{v}} \boldsymbol{A} \Delta b\right|_{\boldsymbol{u} \Delta a}=\nabla_{\boldsymbol{u}} \nabla_{\boldsymbol{v}} \boldsymbol{A} \Delta a \Delta b \tag{35.27}
\end{equation*}
Making an analogous replacement for the displacements along u Delta a\boldsymbol{u} \Delta a we end up with
\begin{align*}
-\delta \boldsymbol{A} & =\left(\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{\nabla}_{\boldsymbol{v}}-\boldsymbol{\nabla}_{\boldsymbol{v}} \boldsymbol{\nabla}_{\boldsymbol{u}}-\boldsymbol{\nabla}_{[\boldsymbol{u}, \boldsymbol{v}]}\right) \boldsymbol{A} \Delta a \Delta b \\
& =\hat{\boldsymbol{R}}(\boldsymbol{u}, \boldsymbol{v}) \boldsymbol{A} \Delta a \Delta b . \tag{35.28}
\end{align*}
This is, of course, exactly what we had in the previous section.
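As a numerical check of eqn 35.25 (an illustration, not part of the text's derivation), the sketch below parallel transports a vector around a small coordinate rectangle on the unit 2-sphere, $\mathrm{d}s^{2}=\mathrm{d}\theta^{2}+\sin^{2}\theta\,\mathrm{d}\phi^{2}$, and compares the resulting holonomy $\delta\boldsymbol{A}$ with the magnitude predicted by the Riemann tensor ($\left|R^{\phi}{ }_{\theta \theta \phi}\right|=1$ for this metric). The sign of the rotation depends on the loop's orientation, so only magnitudes are compared:

```python
import numpy as np

def christoffel(theta):
    """Nonzero Christoffel symbols of the unit sphere; G[a, b, c] = Γ^a_{bc}."""
    G = np.zeros((2, 2, 2))
    G[0, 1, 1] = -np.sin(theta) * np.cos(theta)               # Γ^θ_{φφ}
    G[1, 0, 1] = G[1, 1, 0] = np.cos(theta) / np.sin(theta)   # Γ^φ_{θφ}
    return G

def transport(A, x, dx, steps=400):
    """Parallel transport A along the straight coordinate segment x → x + dx."""
    h = dx / steps
    for k in range(steps):
        theta_mid = x[0] + (k + 0.5) * h[0]   # evaluate Γ at the substep midpoint
        A = A - np.einsum('abc,b,c->a', christoffel(theta_mid), h, A)
    return A

theta0, dth, dph = 1.0, 1e-3, 1e-3
A0 = np.array([1.0, 0.0])                     # start with the unit θ-vector
A, x = A0.copy(), np.array([theta0, 0.0])
for dx in ([dth, 0.0], [0.0, dph], [-dth, 0.0], [0.0, -dph]):   # close the loop
    dx = np.asarray(dx)
    A = transport(A, x, dx)
    x = x + dx

deltaA = A - A0
# |δA^φ| should equal |R^φ_{θθφ}| Δθ Δφ = Δθ Δφ; the θ-component changes
# only at higher order.
print(deltaA, dth * dph)
```

With $\Delta \theta=\Delta \phi=10^{-3}$ the $\phi$-component of $\delta \boldsymbol{A}$ comes out with magnitude close to $\Delta \theta \Delta \phi$, while the $\theta$-component is smaller by several orders of magnitude, as eqn 35.25 predicts.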
This general discussion of parallel transport gives us yet another insight into what curvature is. If we take two vectors that are orthogonal at a point and parallel transport the pair, then they must remain orthogonal, by definition of parallel transport. This seems like a way to construct an orthogonal Cartesian grid across all spacetime. However, it is just an illusion. Parallel transport is defined along a particular curve, and we have just seen how transporting a vector around a loop of curves leads, in curved space, to the vector rotating. This means that the vector field created through the parallel transport of a vector defined at some point P\mathcal{P} is not single valued. (This is because the vector rotates, giving two vectors when we carry out a loop starting at P\mathcal{P} : the original one and the rotated one.) We can therefore conclude that another description of curvature is the property that prevents us using parallel transport to set up a Cartesian grid across all spacetime.
Finally, in our discussion of the properties of R\boldsymbol{R}, we consider how the symmetries affect the number of independent components that the Riemann tensor carries. ^(10){ }^{10}
Example 35.6
How many independent components does $R_{\alpha \beta \gamma \delta}$ have in $n$ dimensions?

Solution: The number of components is

\begin{equation*}
\left(\begin{array}{c}
\text { Number of ways of } \\
\text { choosing } \alpha \beta \gamma \delta \text { subject } \\
\text { to pair symmetries }
\end{array}\right)-\left(\begin{array}{c}
\text { Number of constraints } \\
\text { due to cyclic symmetry }
\end{array}\right) . \tag{35.32}
\end{equation*}
To understand the first term, we note that the antisymmetry of $(\alpha \beta)$ and $(\gamma \delta)$ means there are $M=\frac{n}{2}(n-1)$ ways of choosing the pair $(\alpha \beta)$ and $M$ of choosing $(\gamma \delta)$. The symmetry with respect to exchanging the pairs means there are $M(M+1)/2$ independent choices of pairs of pairs (i.e. an $M$-dimensional symmetric matrix $A_{i j}$ has $M$ diagonal components and $M(M-1)/2$ independent off-diagonal ones). We conclude that there are $\frac{M}{2}(M+1)$ ways of choosing $\alpha \beta \gamma \delta$ when we take all of the pair symmetries into account, where $M=n(n-1)/2$.
To understand the second term, we start with the cyclic symmetry, which is encoded in the identity
\begin{equation*}
R_{\alpha \beta \gamma \delta}+R_{\alpha \delta \beta \gamma}+R_{\alpha \gamma \delta \beta}=0 . \tag{35.31}
\end{equation*}
This identity only provides new information when all four indices are distinct. The number of constraints is then the number of ways in which four objects can be chosen from a collection of $n$ objects, which is $\binom{n}{4}$. In $n=4$ dimensions, we have twenty independent components.

${ }^{11}$ Spacetimes with constant curvature have Ricci tensors with components $R_{\mu \nu}=C g_{\mu \nu}$, with $C$ constant. Spacetimes where $R_{\mu \nu}=0$ are called Ricci flat. Of course, this latter property does not mean that all of the components of the Riemann tensor necessarily vanish; rather that the positive parts of the curvature cancel out the negative parts when the Ricci tensor is computed. In this sense, the Ricci tensor is a sort of average of the Riemann tensor.

${ }^{12}$ The Weyl curvature tensor is invariant with respect to conformal transformations. Its components vanish in Minkowski space (as might be expected) and also in the Robertson-Walker spacetimes. These spaces are therefore sometimes called conformally flat. If the early Universe resembles a Robertson-Walker spacetime then we expect the Weyl curvature to be very small, or zero. As matter clumps together and black holes are formed, the Weyl curvature must increase, diverging at the black hole singularities. However, the idea that at the initial Big-Bang singularity the Weyl curvature is constrained to be small (or zero) is known as the Weyl curvature hypothesis, and it is a strong constraint on cosmological models.
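The counting argument is easy to automate; the helper below is an illustration, not from the text:

```python
from math import comb

# The counting argument as a function: M = n(n-1)/2 antisymmetric pair
# choices, M(M+1)/2 symmetric pairs of pairs, minus C(n, 4) cyclic
# constraints.
def riemann_components(n):
    M = n * (n - 1) // 2
    return M * (M + 1) // 2 - comb(n, 4)

print([riemann_components(n) for n in range(2, 6)])   # [1, 6, 20, 50]
```

This agrees with the closed form $n^{2}\left(n^{2}-1\right) / 12$; in particular, a two-dimensional space has a single independent component (the Gaussian curvature), and four dimensions give the twenty components quoted above.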
After having derived the properties of the Riemann curvature tensor and operator, we shall, in the next chapter, finally calculate its components for a variety of spacetimes. Before we get there we pause to consider the Ricci tensor, which incorporates part of the Riemann tensor into Einstein's equation.
35.4 The meaning of the Ricci tensor
The Ricci tensor is the result of the only contraction of the Riemann tensor that does not vanish. In four spacetime dimensions, it has ten of the twenty independent components of the Riemann tensor. Physically, it can be thought of as that part of the Riemann curvature that causes volumes of matter (i.e. sources of curvature) to shrink. As a consequence of the Einstein equation, the Ricci tensor vanishes in free space. ^(11){ }^{11} In contrast, if we remove the Ricci tensor part from the Riemann tensor, we obtain the Weyl curvature tensor C\boldsymbol{C}, defined as having components
The Weyl tensor contains the other ten independent components of R\boldsymbol{R} and does not generally vanish in the vacuum, in contrast to the Ricci tensor. ^(12){ }^{12} An important physical effect of the Weyl curvature tensor C\boldsymbol{C} is in its role to distort geodesics.
Example 35.7
Since Ricci curvature can sometimes also distort timelike geodesics, it's best to compare the two sorts of curvature in terms of their action on light rays (or null geodesics). The trace-reversed part of the Ricci tensor has components
This part of the curvature causes light rays from a source to focus, just like a positive focusing lens. The part of the curvature captured by the Weyl curvature has the effect of an astigmatic lens, which focuses positively in one plane and defocuses in a perpendicular plane.
Roger Penrose invites us to think about this ^(13){ }^{13} by imagining the action of a transparent non-refracting Sun on rays from distant stars that lie behind it, as shown in Fig. 35.4. Without any gravitational field the light from each star should form a circular image, as shown by the circles drawn with dashed lines. In the presence of the gravitational field from our hypothetical transparent Sun, those rays that pass through the Sun are mostly affected by the Ricci part of the curvature, causing them to be magnified by the positive focusing effect to form a (larger) circular image (shown by the circles drawn with solid lines). On the other hand, stars whose rays pass beyond the rim of the Sun will only experience the Weyl effect, which causes their images to be distorted into ellipses (as shown in the bottom left-hand corner of Fig. 35.4).
We can make these ideas a little more mathematical by considering geodesic deviation.
Notice how the velocity components are contracted against the indices in the second and fourth positions but that we trace over the other two components to make the Ricci tensor.
Consider the matrix K^(mu)_(alpha)=-R^(mu)_(nu alpha beta)u^(nu)u^(beta)K^{\mu}{ }_{\alpha}=-R^{\mu}{ }_{\nu \alpha \beta} u^{\nu} u^{\beta}. If we choose nn to be an eigenvector of this matrix, then Kn=lambda nK \boldsymbol{n}=\lambda \boldsymbol{n}, outputting a parallel vector that is scaled by some amount. If we take the eigenvectors of K\boldsymbol{K} to describe a volume in spacetime, then the trace over K\boldsymbol{K} gives us the rate of change of the volume. ^(14){ }^{14} We have then that
This demonstrates that the Ricci tensor describes the evolution of volumes. The contraction that creates it, which involves tracing over the components of the Riemann tensor, amounts to constructing the response of the volume from the individual lengths that make it up.
Returning to the full geodesic deviation equation $\mathrm{D}^{2} n^{\mu} / \mathrm{d} \lambda^{2}=K^{\mu}{ }_{\nu} n^{\nu}$, we could also consider the fate of a small sphere as we move along a geodesic. We choose the geodesics that follow the edges of the sphere to be initially parallel to the geodesic whose tangent is $\boldsymbol{u}$. Geodesic deviation then causes the sphere to deform into an ellipsoid, whose axes are set by the eigenvectors of $\boldsymbol{K}$ and scaled according to its eigenvalues.
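A small numerical sketch of this sphere-to-ellipsoid picture, using the Newtonian point-mass tidal matrix (an illustrative assumption, echoing Example 35.1) as a stand-in for $K$:

```python
import numpy as np

# Evolving a small sphere of test particles under the deviation
# equation n'' = K n, with K the (assumed) Newtonian point-mass tidal
# matrix, which is traceless.
GM, r, lam = 1.0, 2.0, 0.1
K = (GM / r**3) * np.diag([2.0, -1.0, -1.0])   # radial axis first

# Particles released from rest obey n(λ) ≈ (I + ½Kλ²) n(0) to O(λ²),
# so an initial unit sphere is mapped to an ellipsoid by L.
L = np.eye(3) + 0.5 * K * lam**2
axes = np.linalg.eigvalsh(L)       # semi-axes of the resulting ellipsoid
volume_ratio = np.linalg.det(L)

print(axes)                        # one stretched axis, two squeezed
print(volume_ratio)                # ≈ 1: tracefree K preserves volume
```

The trace of $K$ controls the volume: here it vanishes, so the sphere is sheared into an ellipsoid of (to this order) equal volume, which is the hallmark of Weyl rather than Ricci curvature.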
This chapter has considered the geometric origin of the Riemann tensor, and we have already said that this tensor derives from the fundamental field of general relativity, the metric field. In the following chapter, we consider the most efficient method of extracting the Riemann tensor from a metric.

${ }^{13}$ This is discussed further in Penrose (2004), Section 28.8.
Fig. 35.4 The magnification and distortion of the images of distant stars due to a transparent Sun. The stars' actual sizes are shown by the dashed circles; their images are shown by solid lines.

${ }^{14}$ To see this, consider a two-dimensional area $A=\delta x \delta y$ where $\delta x$ and $\delta y$ are the eigenfunctions of a linear operator $\mathrm{D}^{2} / \mathrm{d} \lambda^{2}=\hat{K}$. Operating on $A$ we obtain $\mathrm{D}^{2} A / \mathrm{d} \lambda^{2}=\hat{K} A$.
Geodesic deviation can be calculated geometrically by considering the condition that the Lie derivative £_(u)n£_{u} n vanishes.
Instead of the Riemann tensor, we can use the Riemann curvature operator hat(R)\hat{\boldsymbol{R}}, defined by
\begin{equation*}
\hat{\boldsymbol{R}}(\boldsymbol{a}, \boldsymbol{b})\, \boldsymbol{c}=\left(\left[\nabla_{\boldsymbol{a}}, \nabla_{\boldsymbol{b}}\right]-\nabla_{[\boldsymbol{a}, \boldsymbol{b}]}\right) \boldsymbol{c} . \tag{35.42}
\end{equation*}
The Ricci tensor represents the part of the Riemann tensor that has a focussing effect on volumes. The Weyl tensor represents the parts left over, which have a distorting effect on light rays.
Exercises
(35.1) Show that, in terms of coordinates, grad_([X,Y])Z\boldsymbol{\nabla}_{[X, Y]} \boldsymbol{Z} can be represented as
(a) Write the equation in index comma notation. (b) Show that using the comma goes to semicolon rule results in an ambiguity.
(c) Use the result of Exercise 35.2 to show that the two possible forms of the equation can be related using the components of the Ricci tensor $R_{\mu \nu}$. There is no definite rule on how to avoid this ambiguity; Misner, Thorne, and Wheeler give a set of rules of thumb for choosing the most physically reasonable form.
(35.4) We are now used to setting up a local orthonormal frame at some point in curved spacetime. Here, following the approach of Poisson and Will, we shall set up Riemann normal coordinates zeta^(mu)\zeta^{\mu}, which provide a local inertial frame where, in addition
to the property $g_{\hat{\mu} \hat{\nu}}(\mathcal{O})=\eta_{\hat{\mu} \hat{\nu}}$ at the origin $\mathcal{O}$, the connection coefficients are also guaranteed to vanish. To do this, we shall work with the basis vectors $\boldsymbol{e}_{\mu}$ of an orthonormal frame defined at the origin. (We won't follow our usual convention of giving these indices hats in this case, to save on clutter.)
Consider the setup for the Riemann normal coordinates zeta^(mu)=lambdau^(mu)\zeta^{\mu}=\lambda u^{\mu}, where u\boldsymbol{u} is tangent to a geodesic and lambda\lambda measures the distance from O\mathcal{O}. The components here are those of an orthonormal frame that we set up at the origin O\mathcal{O} (only). If we vary the directions of the tangents we obtain more geodesics that originate from O\mathcal{O}. These are linked by deviation vectors n_(nu)\boldsymbol{n}_{\nu} (one for each component nu\nu of the tangent) with components
For definiteness, we'll treat the geodesics as spacelike.
(a) Show that at any point on the geodesic Gamma^(mu)_(alpha beta)u^(alpha)u^(beta)=0\Gamma^{\mu}{ }_{\alpha \beta} u^{\alpha} u^{\beta}=0. Why does this imply that the connection coefficients vanish at the origin?
(b) Since the connection coefficients vanish at O\mathcal{O} we can expand
Hint: We saw in Exercise 34.2 that taking derivatives of g_(mu nu)=e_(mu)*e_(nu)g_{\mu \nu}=\boldsymbol{e}_{\mu} \cdot \boldsymbol{e}_{\nu} yields the useful identity
Take the derivative of this to compute the expression.
(h) Finally, expand the metric in the Riemann normal coordinates, to recover eqn 35.24 .
(35.5) Consider a two-dimensional surface spanned by basis vectors A\boldsymbol{A} and B\boldsymbol{B}, where [A,B]=0[\boldsymbol{A}, \boldsymbol{B}]=0. Given that the torsion vanishes and that grad_(A)A=-B\nabla_{\boldsymbol{A}} \boldsymbol{A}=-\boldsymbol{B}, grad_(B)B=B\nabla_{B} \boldsymbol{B}=\boldsymbol{B} and grad_(B)A=A\nabla_{B} \boldsymbol{A}=\boldsymbol{A}, show that the Riemann tensor vanishes.
(35.6) Consider the case where we have the basis vectors X\boldsymbol{X} and Y\boldsymbol{Y}, with the properties [X,Y]=0[\boldsymbol{X}, \boldsymbol{Y}]=0 and
(a) Compute the connection coefficients.
(b) Compute the components of the Riemann tensor.
36
36.1 Connection 1-forms
36.2 Two rules
36.3 Le repère mobile
36.4 Example computations
Chapter summary
Exercises

${ }^{1}$ Recall that we put hats on indices (e.g. $A_{\hat{\mu}}$) to remind ourselves of the use of the orthonormal frame.

${ }^{2}$ We can expand this using the vielbein by writing $\boldsymbol{g}=g_{\alpha \beta} \boldsymbol{d} x^{\alpha} \otimes \boldsymbol{d} x^{\beta}$ and then writing the orthonormal basis 1-forms as $\boldsymbol{\omega}^{\hat{\mu}}=\left(\boldsymbol{e}_{\alpha}\right)^{\hat{\mu}} \boldsymbol{d} x^{\alpha}$. Putting these expressions together yields eqn 36.2.
Cartan's method
The finest collection of frames I ever saw
Sir Humphry Davy (1778-1829) when asked what he thought
of the Paris art galleries
The calculation of curvature tensors is, generally speaking, a tedious business. However, using the geometrical techniques we have built up in the last few chapters, we can come up with a more efficient means of extracting the tensor $\boldsymbol{R}$ from the metric field tensor $\boldsymbol{g}$. The key, discovered by Élie Cartan, is to make use of forms. Cartan's method is especially effective because we often choose to work in the orthonormal frame ${ }^{1}$ with metric ${ }^{2}$
where eta_( hat(mu) hat(nu))\eta_{\hat{\mu} \hat{\nu}} are the components of the Minkowski metric tensor. That is to say, in a typical coordinate frame (t,chi,theta,phi)(t, \chi, \theta, \phi) with diagonal metric, we cast the metric field in the form
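Schematically, with the hatted basis 1-forms of the margin note and the $(t,\chi,\theta,\phi)$ labels above, this is (a sketch consistent with eqn 36.2, not a quotation of the original display):

```latex
\begin{equation*}
\boldsymbol{g}
=\eta_{\hat{\mu}\hat{\nu}}\,\boldsymbol{\omega}^{\hat{\mu}}\otimes\boldsymbol{\omega}^{\hat{\nu}}
=-\boldsymbol{\omega}^{\hat{t}}\otimes\boldsymbol{\omega}^{\hat{t}}
+\boldsymbol{\omega}^{\hat{\chi}}\otimes\boldsymbol{\omega}^{\hat{\chi}}
+\boldsymbol{\omega}^{\hat{\theta}}\otimes\boldsymbol{\omega}^{\hat{\theta}}
+\boldsymbol{\omega}^{\hat{\phi}}\otimes\boldsymbol{\omega}^{\hat{\phi}}.
\end{equation*}
```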
In this chapter, we show how a routine based on taking exterior derivatives of the basis 1-forms $\boldsymbol{\omega}^{\hat{\mu}}$ does the job of extracting the curvature.
36.1 Connection 1-forms
In the last chapter, we identified a curvature operator hat(R)(a,b)=\hat{\boldsymbol{R}}(\boldsymbol{a}, \boldsymbol{b})=[grad_(a),grad_(b)]-grad_([a,b])\left[\boldsymbol{\nabla}_{\boldsymbol{a}}, \boldsymbol{\nabla}_{\boldsymbol{b}}\right]-\boldsymbol{\nabla}_{[a, b]}. This is a complicated way of saying that the curvature tensor relies on the behaviour of second covariant derivatives of a vector field. We might wonder if there's another way of framing this second derivative, using the simple connection symbol grad\boldsymbol{\nabla}. The key is to exploit the similarity between grad\boldsymbol{\nabla} and the exterior derivative d\boldsymbol{d} in their action on vectors.
Originally, we didn't know how to use d\boldsymbol{d} on anything other than forms. In Chapter 34, we defined the connection grad\boldsymbol{\nabla} and the covariant derivative grad_(v)=v*grad\boldsymbol{\nabla}_{\boldsymbol{v}}=\boldsymbol{v} \cdot \boldsymbol{\nabla}. Strip the vector field v\boldsymbol{v} from grad\boldsymbol{\nabla} and its action on a scalar function grad f\boldsymbol{\nabla} f is defined to be equivalent to that of the exterior derivative df\boldsymbol{d} f. We then asked how to use d\boldsymbol{d} on vector fields and came up with the answer that, for vectors, d-=grad\boldsymbol{d} \equiv \boldsymbol{\nabla}. We defined the resulting object dv\boldsymbol{d} \boldsymbol{v} to
be a vector-valued 1-form. The exterior derivative d\boldsymbol{d} acts on the vector field v=v^(mu)e_(mu)\boldsymbol{v}=v^{\mu} \boldsymbol{e}_{\mu} one part at a time
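By the Leibniz rule, this one-part-at-a-time expansion reads (a sketch, consistent with eqn 36.5):

```latex
\begin{equation*}
\boldsymbol{d}\boldsymbol{v}
=\boldsymbol{d}\!\left(v^{\mu}\boldsymbol{e}_{\mu}\right)
=\boldsymbol{d}v^{\mu}\otimes\boldsymbol{e}_{\mu}
+v^{\mu}\,\boldsymbol{d}\boldsymbol{e}_{\mu}.
\end{equation*}
```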
Let's now slightly diverge from our path in Chapter 34, which next involved writing connection coefficients. Instead, knowing that we are working on a manifold with a connection, we expand the vector-valued 1 -form de_(mu)d e_{\mu}, which tells us how the basis vectors change in space. We write this in the (slightly complicated-looking) manner
\begin{equation*}
\boldsymbol{d}\boldsymbol{e}_{\mu}=\boldsymbol{\omega}^{\nu}{}_{\mu}\otimes\boldsymbol{e}_{\nu} \tag{36.5}
\end{equation*}
The objects omega^(nu)_(mu)\omega^{\nu}{ }_{\mu} are 1-forms that express the change of the basis vectors in space: they therefore express the connection itself, and we call them connection 1-forms. The right-hand side of eqn 36.5 tells us that, at a particular point in space, the way in which the basis vectors change can be given by an expansion of tensor products of the basis vectors and connection 1-forms omega^(nu)_(mu)\boldsymbol{\omega}^{\nu}{ }_{\mu}.
We can relate the connection 1 -forms to connection coefficients, as they describe the same property of the space. We write
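The relation (a sketch; the first equality also appears in footnote 8, and the second step follows by feeding $\boldsymbol{e}_{\alpha}$ into the 1-form slot) is:

```latex
\begin{equation*}
\boldsymbol{\omega}^{\nu}{}_{\mu}=\Gamma^{\nu}{}_{\lambda\mu}\,\boldsymbol{\omega}^{\lambda},
\qquad\text{so that}\qquad
\boldsymbol{\nabla}_{\alpha}\boldsymbol{e}_{\mu}
=\boldsymbol{\omega}^{\nu}{}_{\mu}(\boldsymbol{e}_{\alpha})\,\boldsymbol{e}_{\nu}
=\Gamma^{\nu}{}_{\alpha\mu}\,\boldsymbol{e}_{\nu}.
\end{equation*}
```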
This is exactly what we started with, validating the definitions above.
We now have a 1 -form omega^(mu)_(nu)\boldsymbol{\omega}^{\mu}{ }_{\nu} that expresses the connection. ^(3){ }^{3} Next, we need to determine the curvature.
Roughly speaking, we defined curvature in the last chapter in terms of a double derivative of a field $\boldsymbol{v}$ using the operator $\boldsymbol{\nabla}$. Here we shall take the double exterior derivative of $\boldsymbol{v}$ using $\boldsymbol{d}$. The first derivative results in a vector-valued 1-form, and is given by
$^{3}$ In fact, eqn 36.6 provides us with an interpretation of $\boldsymbol{\omega}^{\mu}{}_{\nu}$. The object $\boldsymbol{\nabla}_{\alpha}\boldsymbol{e}_{\mu}$ tells us the rate of change of the basis vector $\boldsymbol{e}_{\mu}$ along the $\alpha$ direction and, by the last step in eqn 36.6, this is equal to $\boldsymbol{\omega}^{\nu}{}_{\mu}(\boldsymbol{e}_{\alpha})\,\boldsymbol{e}_{\nu}$. We can therefore interpret $\boldsymbol{\omega}^{\nu}{}_{\mu}(\boldsymbol{e}_{\alpha})$ as encoding the rate at which $\boldsymbol{e}_{\mu}$ rotates towards $\boldsymbol{e}_{\nu}$ as we move along a curve whose tangent vector is given by $\boldsymbol{e}_{\alpha}$.
$^{4}$ It's important to note that, because the derivative of a vector results in a $(1,1)$ object built from the tensor product of a vector and a 1-form (and not the antisymmetric wedge product between two 1-forms), such objects generally have non-zero exterior derivatives. In short, vectors allow the existence of the peculiar-looking double exterior derivative, and it is this that gives us access to the curvature.
$^{5}$ We interpret this as
Since the component $v^{\mu}$ is itself merely a scalar function, we have $\boldsymbol{d}^{2}v^{\mu}=0$ (for the usual reason that $\boldsymbol{d}^{2}=0$ when acting on any function), and we conclude that
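In the conventions of eqn 36.5, the conclusion (eqn 36.12, Cartan's second structure equation, sketched here with $\mathcal{R}^{\mu}{}_{\nu}$ as it is used in the contraction check below) is:

```latex
\begin{equation*}
\boldsymbol{d}^{2}\boldsymbol{v}
=v^{\nu}\left(\boldsymbol{d}\boldsymbol{\omega}^{\mu}{}_{\nu}
+\boldsymbol{\omega}^{\mu}{}_{\alpha}\wedge\boldsymbol{\omega}^{\alpha}{}_{\nu}\right)\otimes\boldsymbol{e}_{\mu},
\qquad
\mathcal{R}^{\mu}{}_{\nu}
=\boldsymbol{d}\boldsymbol{\omega}^{\mu}{}_{\nu}
+\boldsymbol{\omega}^{\mu}{}_{\alpha}\wedge\boldsymbol{\omega}^{\alpha}{}_{\nu}. \tag{36.12}
\end{equation*}
```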
Equation 36.12 is the most important one of this chapter. It is built from two terms: the first is the exterior derivative of the connection 1-forms, expressing their rate of change in space; the second is a wedge product of the connection 1-forms, expressing their tube-like structure in space. Both of these aspects are needed to describe curvature. We also write, for economy, a curvature 2-form operator $\mathcal{R}(\,)$, which is a $(2,2)$ object designed for the input of a vector$^{5}$
Let's prove this is equivalent to the previously defined curvature tensor by contracting the 2-form $\mathcal{R}^{\mu}{}_{\nu}=\boldsymbol{d}\boldsymbol{\omega}^{\mu}{}_{\nu}+\boldsymbol{\omega}^{\mu}{}_{\alpha}\wedge\boldsymbol{\omega}^{\alpha}{}_{\nu}$ with two vectors. First, consider the 1-form
Now contract the second term omega^(mu)_(sigma)^^omega^(sigma)_(nu)\boldsymbol{\omega}^{\mu}{ }_{\sigma} \wedge \boldsymbol{\omega}^{\sigma}{ }_{\nu} in the same way
From the last example we see that we can write the curvature 2 -form R^(mu)_(nu)\mathcal{R}^{\mu}{ }_{\nu} in terms of the components of the Riemann tensor R\boldsymbol{R} :
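Written out (a sketch, consistent with the factor-of-$1/2$ remark that follows):

```latex
\begin{equation*}
\mathcal{R}^{\mu}{}_{\nu}
=\tfrac{1}{2}\,R^{\mu}{}_{\nu\alpha\beta}\,
\boldsymbol{\omega}^{\alpha}\wedge\boldsymbol{\omega}^{\beta}
=\sum_{|\alpha\beta|}R^{\mu}{}_{\nu\alpha\beta}\,
\boldsymbol{\omega}^{\alpha}\wedge\boldsymbol{\omega}^{\beta},
\end{equation*}
```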
where the restriction on the sum removes the factor of 1//21 / 2 that we had above. ^(6){ }^{6} We now have an expression for the curvature 2 -form, but lack a simple method for finding the connection 1-forms omega^(mu)_(alpha)\boldsymbol{\omega}^{\mu}{ }_{\alpha} from the metric field g\boldsymbol{g}. We now turn to this matter, since it's our reason for pursuing this formalism.
36.2 Two rules
The previous section resulted in a number of equivalent expressions designed to compute the curvature of a spacetime. They rely on using the connection 1 -forms of the spacetime. So how do we find the connection 1 -forms omega^(mu)_(nu)\boldsymbol{\omega}^{\mu}{ }_{\nu} ? The good news is that, given the metric, they can usually be found by guesswork! More specifically, we express our (0,2)(0,2) metric field g\boldsymbol{g} in terms of basis 1-forms g=g_(mu nu)omega^(mu)oxomega^(nu)\boldsymbol{g}=g_{\mu \nu} \boldsymbol{\omega}^{\mu} \otimes \boldsymbol{\omega}^{\nu} and use the exterior derivatives of the basis 1 -forms to guess the connection 1 -forms.
What enables this strategy are two very restrictive rules that we can derive by simply considering the action of a Leibniz product rule on a dot product
We're familiar with the important rule (:omega^(mu),e_(nu):)=delta^(mu)_(nu)\left\langle\boldsymbol{\omega}^{\mu}, \boldsymbol{e}_{\nu}\right\rangle=\delta^{\mu}{ }_{\nu}. The conceptual trick here, due to Cartan, is to take a field of vectors and a field of 1 -forms and consider the contraction of a basis vector e_(mu)\boldsymbol{e}_{\mu} and a basis 1 -form omega^(mu)\boldsymbol{\omega}^{\mu}, at a specific point P\mathcal{P}. In this spirit, we write that the derivative of a point dP\boldsymbol{d} \mathcal{P} corresponds to
What does this mean and how does this work? Cartan's view was that a vector could be seen as the movement of a point, and so the most primitive version of a vector would be the movement of a point $\boldsymbol{d}\mathcal{P}$.
$^{6}$ The $|\alpha\beta|$ notation here should be interpreted as only allowing components in the order $\alpha\beta$ and not $\beta\alpha$. In the interests of writing memorable equations, we can also write things in terms of the $\mathcal{R}(\,)$ operator, defining the $(2,0)$ object $\mathcal{R}^{\mu\nu}$ such that $\mathcal{R}(\,)=\frac{1}{2}\left(\boldsymbol{e}_{\mu}\wedge\boldsymbol{e}_{\nu}\right)\mathcal{R}^{\mu\nu}$.
Then we can write the curvature operator in terms of the components of the Riemann tensor. The curvature operator $\mathcal{R}$ is given by $\mathcal{R}(\,)=\frac{1}{4}\left(\boldsymbol{e}_{\mu}\wedge\boldsymbol{e}_{\nu}\right)R^{\mu\nu}{}_{\alpha\beta}\left(\boldsymbol{\omega}^{\alpha}\wedge\boldsymbol{\omega}^{\beta}\right)$.$^{7}$ That is to say
This is not yet quite a vector. The object $\boldsymbol{d}\mathcal{P}(\ ,\ )$ should be interpreted as a $(1,1)$ quantity. This means that it's a vector that hasn't yet been given a direction, and so is a sort of proto-vector that doesn't point anywhere. In fact, to point it along a direction we insert a vector $\boldsymbol{v}$ into one of its slots, to find $\langle\boldsymbol{d}\mathcal{P},\boldsymbol{v}\rangle=\boldsymbol{v}$. For this to work, we must therefore have that$^{7}$ $\boldsymbol{d}\mathcal{P}=\boldsymbol{e}_{\mu}\otimes\boldsymbol{\omega}^{\mu}$.
Next, we apply the product rule to the exterior derivative of dP=\boldsymbol{d} \mathcal{P}=e_(mu)oxomega^(mu)\boldsymbol{e}_{\mu} \otimes \boldsymbol{\omega}^{\mu}. Since P\mathcal{P} acts like a function, we must have d^(2)P=0\boldsymbol{d}^{2} \mathcal{P}=0, and so
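Carrying this out (a sketch; the final statement is eqn 36.91 of the chapter summary, the "first structural equation" of footnote 8):

```latex
\begin{equation*}
0=\boldsymbol{d}^{2}\mathcal{P}
=\boldsymbol{d}\boldsymbol{e}_{\mu}\wedge\boldsymbol{\omega}^{\mu}
+\boldsymbol{e}_{\mu}\otimes\boldsymbol{d}\boldsymbol{\omega}^{\mu}
=\boldsymbol{e}_{\nu}\otimes\left(\boldsymbol{d}\boldsymbol{\omega}^{\nu}
+\boldsymbol{\omega}^{\nu}{}_{\mu}\wedge\boldsymbol{\omega}^{\mu}\right)
\quad\Longrightarrow\quad
\boldsymbol{d}\boldsymbol{\omega}^{\nu}
=-\boldsymbol{\omega}^{\nu}{}_{\mu}\wedge\boldsymbol{\omega}^{\mu}.
\end{equation*}
```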
^(8){ }^{8} This is also sometimes known as Cartan's first structural equation. Note that combining it with omega^(nu)_(mu)=Gamma^(nu)_(lambda mu)omega^(lambda)\boldsymbol{\omega}^{\nu}{ }_{\mu}=\Gamma^{\nu}{ }_{\lambda \mu} \boldsymbol{\omega}^{\lambda} we obtain
There is a second simplification we can make to restrict the connection 1-forms. This one is based on the definition $g_{\mu\nu}=\boldsymbol{e}_{\mu}\cdot\boldsymbol{e}_{\nu}$. Insert $\boldsymbol{u}=\boldsymbol{e}_{\mu}$ and $\boldsymbol{v}=\boldsymbol{e}_{\nu}$ into the product rule (eqn 36.22) to find
Again using the fact that a dot is an instruction to combine vectors according to the rule e_(mu)*e_(nu)=g_(mu nu)\boldsymbol{e}_{\mu} \cdot \boldsymbol{e}_{\nu}=g_{\mu \nu}, we obtain
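With $\boldsymbol{d}\boldsymbol{e}_{\mu}=\boldsymbol{\omega}^{\lambda}{}_{\mu}\otimes\boldsymbol{e}_{\lambda}$ and the shorthand $\boldsymbol{\omega}_{\mu\nu}\equiv g_{\mu\lambda}\boldsymbol{\omega}^{\lambda}{}_{\nu}$ (a sketch of the compatibility condition referred to next), this gives:

```latex
\begin{equation*}
\boldsymbol{d}g_{\mu\nu}
=\boldsymbol{\omega}^{\lambda}{}_{\mu}\,g_{\lambda\nu}
+\boldsymbol{\omega}^{\lambda}{}_{\nu}\,g_{\mu\lambda}
=\boldsymbol{\omega}_{\nu\mu}+\boldsymbol{\omega}_{\mu\nu}.
\end{equation*}
```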
The compatibility condition is actually equivalent to the condition grad g=\boldsymbol{\nabla} \boldsymbol{g}= 0 seen in Chapter 34.
With these two key equations available to help us turn the metric into the connection 1-forms, we make a final simplifying move. We agree to work in an orthonormal basis, which means $g_{\mu\nu}=\eta_{\hat{\mu}\hat{\nu}}$ and $\boldsymbol{d}\eta_{\hat{\mu}\hat{\nu}}=0$, so, by the compatibility condition, the connection 1-forms obey $\boldsymbol{\omega}_{\hat{\mu}\hat{\nu}}=-\boldsymbol{\omega}_{\hat{\nu}\hat{\mu}}$.
Having covered a lot of technical ground, let's collect together the key ideas, in the order they will be used:
Idea 1: Cartan's two rules for finding curvature 1-forms
As advertised, we shall find in the examples below that Idea 1 and some (rather uninspired) guesswork solve the problem of finding the connection 1-forms $\boldsymbol{\omega}^{\hat{\mu}}{}_{\hat{\nu}}$. We then immediately have access to the curvature 2-form and can extract the components of the Riemann tensor. Although, at this stage, this no doubt looks rather involved, in practice it makes working out Riemann tensor components a fairly simple job.
36.3 Le repère mobile
Cartan intended his method to be used in an orthonormal frame of reference. Although we have been referring regularly to the orthonormal frame, we should perhaps have spoken of orthonormal frames. The idea, after all, is that, at a particular point in spacetime, we erect a local frame of reference and normalize its orthogonal axes. This gives us an orthonormal frame at each point in spacetime, which we might helpfully think of as a field of frames. The relationship between the orthonormal frames found at each point is encoded in the connection, expressed in Cartan's method as the 1-form field omega^(mu)_(nu)(x)\boldsymbol{\omega}^{\mu}{ }_{\nu}(x). Cartan himself imagined moving through space watching the orthonormal frame picked out at each point seemingly change as a function of position. He called this changing frame of reference Le repère mobile ( ~~\approx the moving frame). ^(9){ }^{9} Whichever way we think of it, the field of orthonormal frames is the starting point for using Cartan's method. ^(10){ }^{10} Once we have the orthonormal frame(s), we can find the connection 1 -forms.
In what follows, we shall be working in local orthonormal frames for our computations. These are frames in which the Minkowski metric raises and lowers indices, but where the connection coefficients do not vanish. As we've stressed previously, there is a simple prescription for obtaining the orthonormal frame: we choose the one appropriate for an observer at rest in the coordinate frame at a particular point.
In general, the metric can be written as
$^{9}$ It is somehow appropriate that the French reflexive verb se repérer means to find one's way around, something we need to do all the time in curved spaces!
$^{10}$ In reviewing this procedure here, we are just upgrading the procedure of normalizing the components of a diagonal metric, discussed in Chapter 10.
Fig. 36.1 The stages in Cartan's recipe for computing the Riemann tensor.
Should you want them, you can also extract connection coefficients using $\boldsymbol{\omega}^{\hat{\nu}}{}_{\hat{\mu}}=\Gamma^{\hat{\nu}}{}_{\hat{\lambda}\hat{\mu}}\,\boldsymbol{\omega}^{\hat{\lambda}}$. However, you should remember that not only are these not required to access the curvature, but they cannot simply be transformed between frames as tensors and, importantly here, in non-coordinate frames we lose the symmetry in the lower indices.
From here we write omega^(mu)oxomega^(mu)=(omega^(mu))^(2)\boldsymbol{\omega}^{\mu} \otimes \boldsymbol{\omega}^{\mu}=\left(\boldsymbol{\omega}^{\mu}\right)^{2}, so we have, for the usual case of a diagonal metric in (3+1)-dimensional spacetime,
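For instance, with Cartesian-like labels $(\hat{t},\hat{x},\hat{y},\hat{z})$ (labels chosen here purely for illustration), this shorthand gives the sketch:

```latex
\begin{equation*}
\boldsymbol{g}
=-\left(\boldsymbol{\omega}^{\hat{t}}\right)^{2}
+\left(\boldsymbol{\omega}^{\hat{x}}\right)^{2}
+\left(\boldsymbol{\omega}^{\hat{y}}\right)^{2}
+\left(\boldsymbol{\omega}^{\hat{z}}\right)^{2}.
\end{equation*}
```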
With this, we are finally ready to try some example computations.
36.4 Example computations
Here's the recipe for how to calculate the Riemann tensor components using these ideas (see also Fig. 36.1).
Step I: Identify the orthonormal basis from the metric, by comparing the given metric to
In this basis, we have dg_(mu nu)=deta_( hat(mu) hat(nu))=0\boldsymbol{d} g_{\mu \nu}=\boldsymbol{d} \eta_{\hat{\mu} \hat{\nu}}=0, so, by the compatibility condition omega_( hat(mu) hat(nu))=-omega_( hat(nu) hat(mu))\boldsymbol{\omega}_{\hat{\mu} \hat{\nu}}=-\boldsymbol{\omega}_{\hat{\nu} \hat{\mu}}.
Step II: Take the exterior derivatives of the basis 1 -forms. We have Idea 1 (the symmetry condition domega^( hat(mu))=-omega^( hat(mu))_( hat(nu))^^omega^( hat(nu))\boldsymbol{d} \boldsymbol{\omega}^{\hat{\mu}}=-\boldsymbol{\omega}^{\hat{\mu}}{ }_{\hat{\nu}} \wedge \boldsymbol{\omega}^{\hat{\nu}} ), and so it is an easy job to simply guess the values of omega^( hat(mu))_( hat(nu))\boldsymbol{\omega}^{\hat{\mu}}{ }_{\hat{\nu}} from the results of the derivatives.
Step III: Take the exterior derivatives of the connection 1-forms $\boldsymbol{\omega}^{\hat{\mu}}{}_{\hat{\nu}}$.
Step IV: Assemble the curvature 2-form $\mathcal{R}^{\hat{\mu}}{}_{\hat{\nu}}$ using Idea 2.
Step V: Extract components R_( hat(nu) hat(alpha) hat(beta))^( hat(mu))R_{\hat{\nu} \hat{\alpha} \hat{\beta}}^{\hat{\mu}} as needed using Idea 3. Remember that you're currently working in the orthonormal frame, so it might be necessary to transform out of it using the vielbein components.
Let's discuss some examples. First we warm up with the (fairly trivial) case of flat two-dimensional space. Of course we expect the curvature
tensor to vanish if the space is truly flat. We won't include time in these first few computations, meaning that indices are raised and lowered in the orthonormal frame using the metric components eta^(mu nu)=diag(1,1)\eta^{\mu \nu}=\operatorname{diag}(1,1), so that up and down become equivalent since this is Euclidean space.
Example 36.3
Flat space in polar coordinates has a metric line element ds^(2)=dr^(2)+r^(2)dtheta^(2)\boldsymbol{d} s^{2}=\boldsymbol{d} r^{2}+r^{2} \boldsymbol{d} \theta^{2}. We can identify an orthonormal basis with basis 1-forms
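The intermediate steps go as follows (a sketch, consistent with the warning in footnote 11):

```latex
\begin{gather*}
\boldsymbol{\omega}^{\hat{r}}=\boldsymbol{d}r,\qquad
\boldsymbol{\omega}^{\hat{\theta}}=r\,\boldsymbol{d}\theta,\qquad
\boldsymbol{d}\boldsymbol{\omega}^{\hat{\theta}}
=\boldsymbol{d}r\wedge\boldsymbol{d}\theta
=-\boldsymbol{\omega}^{\hat{\theta}}{}_{\hat{r}}\wedge\boldsymbol{\omega}^{\hat{r}}
\;\Rightarrow\;
\boldsymbol{\omega}^{\hat{\theta}}{}_{\hat{r}}=\boldsymbol{d}\theta,\\
\mathcal{R}^{\hat{\theta}}{}_{\hat{r}}
=\boldsymbol{d}\boldsymbol{\omega}^{\hat{\theta}}{}_{\hat{r}}
+\boldsymbol{\omega}^{\hat{\theta}}{}_{\hat{\alpha}}
\wedge\boldsymbol{\omega}^{\hat{\alpha}}{}_{\hat{r}}
=\boldsymbol{d}^{2}\theta+0=0.
\end{gather*}
```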
As a result $\mathcal{R}^{\hat{\mu}}{}_{\hat{\nu}}=0$ for all components. This means that the Riemann tensor vanishes, confirming (as we certainly knew) that this space is flat.
Next, we look at the space on the surface of the 2-sphere. Here we expect to see the effect of curvature.
Example 36.4
Spherical space has a metric line element ds^(2)=a^(2)dtheta^(2)+a^(2)sin^(2)theta dphi^(2)\boldsymbol{d} \boldsymbol{s}^{2}=a^{2} \boldsymbol{d} \theta^{2}+a^{2} \sin ^{2} \theta \boldsymbol{d} \phi^{2}, with aa constant. We identify orthonormal basis 1 -forms
\begin{equation*}
\boldsymbol{\omega}^{\hat{\theta}}=a\,\boldsymbol{d}\theta,\qquad
\boldsymbol{\omega}^{\hat{\phi}}=a\sin\theta\,\boldsymbol{d}\phi \tag{36.51}
\end{equation*}
We find exterior derivatives
\begin{equation*}
\boldsymbol{d}\boldsymbol{\omega}^{\hat{\theta}}=0 \tag{36.52}
\end{equation*}
and
\begin{equation*}
\boldsymbol{d}\boldsymbol{\omega}^{\hat{\phi}}
=a\cos\theta\,\boldsymbol{d}\theta\wedge\boldsymbol{d}\phi
=-\frac{\cos\theta}{a\sin\theta}\,
\boldsymbol{\omega}^{\hat{\phi}}\wedge\boldsymbol{\omega}^{\hat{\theta}} \tag{36.53}
\end{equation*}
We read off that a non-zero connection 1-form is given by
\begin{equation*}
\boldsymbol{\omega}^{\hat{\phi}}{}_{\hat{\theta}}
=\frac{\cos\theta}{a\sin\theta}\,\boldsymbol{\omega}^{\hat{\phi}}
=\cos\theta\,\boldsymbol{d}\phi \tag{36.54}
\end{equation*}
and, using $\boldsymbol{\omega}_{\hat{i}\hat{j}}=-\boldsymbol{\omega}_{\hat{j}\hat{i}}$, get for free that$^{12}$
$^{11}$ Warning: To avoid making a mistake in taking this derivative, it's best to consider the version written in terms of the 1-forms $\boldsymbol{d}x^{\mu}$ rather than $\boldsymbol{\omega}^{\mu}$. Note also that since $\boldsymbol{\omega}_{\hat{i}\hat{j}}=-\boldsymbol{\omega}_{\hat{j}\hat{i}}$ (see Exercises) we have $\boldsymbol{\omega}^{\hat{\theta}}{}_{\hat{r}}=-\boldsymbol{\omega}^{\hat{r}}{}_{\hat{\theta}}=\boldsymbol{\omega}^{\hat{\theta}}/r$, so that $\boldsymbol{\omega}^{\hat{\theta}}{}_{\hat{r}}\wedge\boldsymbol{\omega}^{\hat{r}}{}_{\hat{\theta}}=0$.
$^{13}$ Here the symbol $|\alpha\beta|$ orders components in the order they appear in the wedge product. That is, $|\hat{\alpha}\hat{\beta}|=\hat{\theta}\hat{\phi}$.
$^{14}$ The easiest way to proceed here is to manipulate indices using the symmetries of $\boldsymbol{R}$. In this case,
We also find$^{14}$ $R^{\hat{\phi}}{}_{\hat{\theta}\hat{\phi}\hat{\theta}}=\frac{1}{a^{2}}$, which, in the coordinate frame, becomes
Note that, being a scalar, RR is the same if we choose to evaluate it in the coordinate frame using the Ricci tensor components R_(mu nu)R_{\mu \nu}.
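As an independent cross-check on this example, the same curvature can be obtained by the coordinate-frame route that Cartan's method is designed to short-cut: compute the Christoffel symbols, then the Riemann and Ricci tensors. A minimal sketch in Python using sympy (assumed available; the index conventions in the comments are the standard ones, an assumption rather than a quotation from the text):

```python
# Cross-check of Example 36.4: curvature of the 2-sphere of radius a,
# computed in the coordinate frame (theta, phi) from Christoffel symbols.
import sympy as sp

theta, phi, a = sp.symbols('theta phi a', positive=True)
x = [theta, phi]
g = sp.diag(a**2, a**2 * sp.sin(theta)**2)   # metric: ds^2 = a^2 dtheta^2 + a^2 sin^2(theta) dphi^2
ginv = g.inv()
n = 2

# Christoffel symbols Gamma^l_{ij} = (1/2) g^{lk} (d_i g_{jk} + d_j g_{ik} - d_k g_{ij})
Gamma = [[[sum(ginv[l, k] * (sp.diff(g[j, k], x[i]) + sp.diff(g[i, k], x[j])
                             - sp.diff(g[i, j], x[k])) for k in range(n)) / 2
           for j in range(n)] for i in range(n)] for l in range(n)]

# Riemann tensor R^l_{ijk} = d_j Gamma^l_{ki} - d_k Gamma^l_{ji}
#                            + Gamma^l_{jm} Gamma^m_{ki} - Gamma^l_{km} Gamma^m_{ji}
def Riem(l, i, j, k):
    expr = sp.diff(Gamma[l][k][i], x[j]) - sp.diff(Gamma[l][j][i], x[k])
    expr += sum(Gamma[l][j][m] * Gamma[m][k][i]
                - Gamma[l][k][m] * Gamma[m][j][i] for m in range(n))
    return sp.simplify(expr)

# Ricci tensor R_{ij} = R^l_{ilj} and Ricci scalar R = g^{ij} R_{ij}
Ricci = sp.Matrix(n, n, lambda i, j: sp.simplify(sum(Riem(l, i, l, j)
                                                     for l in range(n))))
R_scalar = sp.simplify(sum(ginv[i, j] * Ricci[i, j]
                           for i in range(n) for j in range(n)))
print(R_scalar)   # the Ricci scalar of the 2-sphere: 2/a**2
```

The scale factor $a$ drops out of the Christoffel symbols but survives in the curvature, reproducing the $1/a^{2}$ components found above.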
What can be done on the surface of a sphere can be done on other surfaces and there are several examples in the exercises at the end of the chapter. Let's now include time and find the curvature of some nontrivial spacetimes. First is an expanding Universe model from Chapter 18 where space is flat, but spacetime is not.
Example 36.5
The de Sitter model of the Universe ^(15){ }^{15} has a metric with line element
First we note
and we use domega^( hat(t))=-omega^( hat(t))_( hat(k))^^omega^( hat(k))\boldsymbol{d} \boldsymbol{\omega}^{\hat{t}}=-\boldsymbol{\omega}^{\hat{t}}{ }_{\hat{k}} \wedge \boldsymbol{\omega}^{\hat{k}}, from which we guess that omega_( hat(k))^( hat(t))propomega^( hat(k))\boldsymbol{\omega}_{\hat{k}}^{\hat{t}} \propto \boldsymbol{\omega}^{\hat{k}}. Then
we guess $\boldsymbol{\omega}^{\hat{k}}{}_{\hat{t}}=\frac{\dot{a}}{a}\,\boldsymbol{\omega}^{\hat{k}}$ and $\boldsymbol{\omega}^{\hat{k}}{}_{\hat{l}}=0$. Now to compute the curvature 2-form
To find the $\hat{i}\hat{i}$ components we have $R_{\hat{i}\hat{i}}=R^{\hat{t}}{}_{\hat{i}\hat{t}\hat{i}}+\sum_{j=1}^{3}R^{\hat{j}}{}_{\hat{i}\hat{j}\hat{i}}$. However, the component with $\hat{i}=\hat{j}$ is $R^{\hat{i}}{}_{\hat{i}\hat{i}\hat{i}}=0$ because of the symmetries of the Riemann tensor. As a result we have
In this case, it's meaningful to extract the components of the Einstein tensor G_(mu nu)=G_{\mu \nu}=R_(mu nu)-(1)/(2)g_(mu nu)RR_{\mu \nu}-\frac{1}{2} g_{\mu \nu} R. These are
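As a checkpoint, the standard orthonormal-frame result for a spatially flat expanding universe (quoted here as a standard result with signature $(-,+,+,+)$, which the computation in this example should reproduce) is:

```latex
\begin{equation*}
G_{\hat{t}\hat{t}}=3\,\frac{\dot{a}^{2}}{a^{2}},\qquad
G_{\hat{i}\hat{i}}=-\left(\frac{2\ddot{a}}{a}+\frac{\dot{a}^{2}}{a^{2}}\right).
\end{equation*}
```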
$^{17}$ We can therefore say $R^{\hat{k}}{}_{\hat{t}\hat{k}\hat{t}}=R_{\hat{k}\hat{t}\hat{k}\hat{t}}=R_{\hat{t}\hat{k}\hat{t}\hat{k}}=-R^{\hat{t}}{}_{\hat{k}\hat{t}\hat{k}}$.
$^{18}$ We write the expressions in this form for simplicity, where the index is raised by the appropriate components of the Minkowski tensor, since we're working in the orthonormal frame.
$^{19}$ Apply the rule
where we are forced to rearrange the components in order to obey the instruction $|\hat{\alpha}\hat{\beta}|$.
$^{20}$ The simplest way to do this is to use the rule from Chapter 21 that
There are several more examples to try in the exercises.
We now have an efficient method to extract curvature of a spacetime from its metric. In the final chapters of this part of the book, we discuss some different aspects of geometric techniques.
Chapter summary
To compute the Riemann tensor, work in the orthonormal frame and compute the curvature 1-forms, which have the properties
\begin{equation*}
\boldsymbol{d}\boldsymbol{\omega}^{\mu}
+\boldsymbol{\omega}^{\mu}{}_{\nu}\wedge\boldsymbol{\omega}^{\nu}=0, \tag{36.91}
\end{equation*}
and omega_( hat(mu) hat(nu))=-omega_( hat(nu) hat(mu))\boldsymbol{\omega}_{\hat{\mu} \hat{\nu}}=-\boldsymbol{\omega}_{\hat{\nu} \hat{\mu}}.
(36.1) Show that, when working in the orthonormal frame,
(a) $\boldsymbol{\omega}^{\hat{t}}{}_{\hat{i}}=\boldsymbol{\omega}^{\hat{i}}{}_{\hat{t}}$.
(b) $\boldsymbol{\omega}^{\hat{i}}{}_{\hat{j}}=-\boldsymbol{\omega}^{\hat{j}}{}_{\hat{i}}$.
(36.2) Read off the connection coefficients in an orthonormal frame for flat space described using cylindrical polar coordinates.
(36.3) A surface has a metric with line element
Compute (i) the components of R\boldsymbol{R}, (ii) the components of the Ricci tensor, and (iii) the Ricci scalar.
(36.5) The torus metric has a line element
Compute: (a) the components of R\boldsymbol{R}, (b) the components of the Ricci tensor, and (c) the Ricci scalar.
(36.6) The Poincaré half-plane has line element
Compute (a) the components of R\boldsymbol{R}, (b) the components of the Ricci tensor, and (c) the Ricci scalar.
(36.7) Consider the expanding universe of the Friedmann model, with metric line element
\begin{equation*}
\boldsymbol{d}s^{2}=-\boldsymbol{d}t^{2}+a(t)^{2}\left[\boldsymbol{d}\chi^{2}+\sin^{2}\chi\left(\boldsymbol{d}\theta^{2}+\sin^{2}\theta\,\boldsymbol{d}\phi^{2}\right)\right]. \tag{36.98}
\end{equation*}
Compute (a) $\mathcal{R}^{\hat{\mu}}{}_{\hat{\nu}}$, (b) the Ricci scalar and (c) the components of the Einstein tensor $\boldsymbol{G}$ in the orthonormal frame.
(36.8) Consider a time-evolving star with a spherically symmetric, time-dependent metric line element
\begin{equation*}
\boldsymbol{d}s^{2}=-\mathrm{e}^{2\Phi}\,\boldsymbol{d}T^{2}+\mathrm{e}^{2\Lambda}\,\boldsymbol{d}R^{2}+r^{2}\left(\boldsymbol{d}\theta^{2}+\sin^{2}\theta\,\boldsymbol{d}\phi^{2}\right), \tag{36.99}
\end{equation*}
where Phi(R,T),Lambda(R,T)\Phi(R, T), \Lambda(R, T) and r(R,T)r(R, T) are all functions of the coordinates RR and TT.
(a) Show that
where
$E=\mathrm{e}^{-2\Phi}\left(\ddot{\Lambda}+\dot{\Lambda}^{2}-\dot{\Lambda}\dot{\Phi}\right)-\mathrm{e}^{-2\Lambda}\left(\Phi''+\Phi'^{2}-\Phi'\Lambda'\right)$,
$\bar{E}=\frac{1}{r}\mathrm{e}^{-2\Phi}\left(\ddot{r}-\dot{r}\dot{\Phi}\right)-\frac{1}{r}\mathrm{e}^{-2\Lambda}r'\Phi'$,
$H=\frac{1}{r}\mathrm{e}^{-\Phi}\mathrm{e}^{-\Lambda}\left(\dot{r}'-\dot{r}\Phi'-r'\dot{\Lambda}\right)$,
$F=\frac{1}{r^{2}}\left(1-(r')^{2}\mathrm{e}^{-2\Lambda}+\dot{r}^{2}\mathrm{e}^{-2\Phi}\right)$,
$\bar{F}=\frac{1}{r}\mathrm{e}^{-2\Phi}\dot{r}\dot{\Lambda}+\frac{1}{r}\mathrm{e}^{-2\Lambda}\left(r'\Lambda'-r''\right)$.
See Misner, Thorne, and Wheeler Chapters 14, 26, and 32 for further details, including the use of this metric in studying stellar pulsations and gravitational collapse.
The material in this chapter is useful in understanding the geometrical interpretation of electromagnetism (Chapter 42) and the Bianchi identity (Chapter 43).
$^{1}$ In Chapter 32 we saw that an antisymmetric tensor, formed from $N$ vectors wedged together, is known as an $N$-vector. It is an antisymmetric $(N,0)$ tensor.
$^{2}$ Sir William V. D. Hodge (1903-1975). The influential mathematician Sir Michael Atiyah was one of his doctoral students. Hodge's three-volume work Methods of Algebraic Geometry, co-written with Daniel Pedoe (1910-1998), freely used the component notation referred to by Cartan in the quotation above.
37 Duality and the volume form
In this chapter we shall look at pairs of tensors that encode the same information. By this we mean that we start with one sort of tensor and uniquely obtain another that expresses related physical content. In this section, we shall see how to map between a $p$-form, which is a $(0,p)$ antisymmetric tensor that can be built from 1-forms using the wedge product, and an antisymmetric $q$-vector, a $(q,0)$ tensor built from vectors, again combined using the wedge product.$^{1}$ This mapping is known as a duality, and can be carried out using an operation known as the Hodge star.$^{2}$ Our rather formal discussion in this chapter provides a general method of mapping between two different sorts of tensor. The payoff from this formalism will be a new insight into how volumes can be encoded in tensor algebra.
37.1 Motivation: 2-forms and flux
Let's consider a 2 -form in three-dimensional Euclidean space with components
\begin{equation*}
\tilde{\boldsymbol{F}}(\ ,\ )=F^{1}(\boldsymbol{d}y\wedge\boldsymbol{d}z)+F^{2}(\boldsymbol{d}z\wedge\boldsymbol{d}x)+F^{3}(\boldsymbol{d}x\wedge\boldsymbol{d}y) \tag{37.1}
\end{equation*}
By virtue of working in space with n=3n=3 dimensions, this object has three components. Now let's extract the components and use them to form a vector u\boldsymbol{u}, which we shall write
What is the relationship between $\tilde{\boldsymbol{F}}$ and $\boldsymbol{u}$? We shall see shortly that these objects are dual to each other and can be related via the Hodge star operation, written as $\star\tilde{\boldsymbol{F}}=\boldsymbol{u}$. Physically, we can see how they are related by using the notion of flux.
We shall consider a fluid flowing throughout a volume with velocity $\boldsymbol{u}$. The flux of the fluid is the amount of fluid flowing through an area $A$ in unit time. If the area is a parallelogram with sides given by vectors $\boldsymbol{a}$ and $\boldsymbol{b}$, then we shall show in Example 37.1 that
\begin{equation*}
(\text{Flux of }\boldsymbol{u}\text{ through }A)=\tilde{\boldsymbol{F}}(\boldsymbol{a},\boldsymbol{b}). \tag{37.3}
\end{equation*}
From our discussion in Example 32.6 of Chapter 32, it follows that we can determine the flux by computing tilde(F)(a,b)=(: tilde(F),a^^b:)\tilde{\boldsymbol{F}}(\boldsymbol{a}, \boldsymbol{b})=\langle\tilde{\boldsymbol{F}}, \boldsymbol{a} \wedge \boldsymbol{b}\rangle, that is, the inner product of the 2 -form F\boldsymbol{F} and the bivector a^^b\boldsymbol{a} \wedge \boldsymbol{b}. However, in the next example, we show that the same result can be found using two related objects: the vector u\boldsymbol{u} constructed from F\boldsymbol{F} and a 1-form S\boldsymbol{S} constructed from the bivector a^^b\boldsymbol{a} \wedge \boldsymbol{b}.
Example 37.1
Suppose the only non-zero component of $\boldsymbol{u}$ is $F^{3}$. Then, in three dimensions, the amount of fluid per unit time passing through $A$ will be given by the projection of $\boldsymbol{u}$ onto the area: $F^{3}\boldsymbol{e}_{z}\cdot(\boldsymbol{a}\times\boldsymbol{b})$, where we have used the conventional cross product for three-dimensional space to express the area spanned by $\boldsymbol{a}$ and $\boldsymbol{b}$. Expanding components, we find this is equal to $F^{3}\left(a^{x}b^{y}-b^{x}a^{y}\right)=F^{3}(\boldsymbol{d}x\wedge\boldsymbol{d}y)(\boldsymbol{a},\boldsymbol{b})$. We now let $F^{1}$ and $F^{2}$ take non-zero values and repeat the argument for these components to find that the total flux is given by $F^{1}(\boldsymbol{d}y\wedge\boldsymbol{d}z)(\boldsymbol{a},\boldsymbol{b})+F^{2}(\boldsymbol{d}z\wedge\boldsymbol{d}x)(\boldsymbol{a},\boldsymbol{b})+F^{3}(\boldsymbol{d}x\wedge\boldsymbol{d}y)(\boldsymbol{a},\boldsymbol{b})=\tilde{\boldsymbol{F}}(\boldsymbol{a},\boldsymbol{b})$.
This proves eqn 37.3.
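As a quick numerical sanity check of Example 37.1 (a sketch in plain Python, not from the text; the helper names are ours), we can verify that summing the three component-wise contributions reproduces the elementary flux $\boldsymbol{u} \cdot(\boldsymbol{a} \times \boldsymbol{b})$:

```python
# Sketch: check that the 2-form evaluation of Example 37.1 agrees with the
# elementary expression u . (a x b) for the flux through the parallelogram
# spanned by a and b.
def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def flux_two_form(F, a, b):
    # F~(a, b) = F^1 dy^dz(a,b) + F^2 dz^dx(a,b) + F^3 dx^dy(a,b)
    return (F[0]*(a[1]*b[2] - b[1]*a[2])
          + F[1]*(a[2]*b[0] - b[2]*a[0])
          + F[2]*(a[0]*b[1] - b[0]*a[1]))

u = (1.0, 2.0, 3.0)                       # arbitrary components F^1, F^2, F^3
a, b = (1.0, 0.5, 0.0), (0.0, 1.0, 2.0)   # arbitrary parallelogram sides
assert flux_two_form(u, a, b) == sum(ui*ci for ui, ci in zip(u, cross(a, b)))
```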
It's useful to note that the bivector $\boldsymbol{a} \wedge \boldsymbol{b}$ has non-zero components $(\boldsymbol{a} \wedge \boldsymbol{b})^{yz}=a^{y} b^{z}-b^{y} a^{z}$, $(\boldsymbol{a} \wedge \boldsymbol{b})^{zx}=a^{z} b^{x}-b^{z} a^{x}$ and $(\boldsymbol{a} \wedge \boldsymbol{b})^{xy}=a^{x} b^{y}-b^{x} a^{y}$. We notice that if these were to be arranged to make the components of a 1-form, we would have
\begin{equation*}
S_{x}=(\boldsymbol{a} \wedge \boldsymbol{b})^{yz}, \quad S_{y}=(\boldsymbol{a} \wedge \boldsymbol{b})^{zx}, \quad S_{z}=(\boldsymbol{a} \wedge \boldsymbol{b})^{xy} .
\end{equation*}
The flux can then also be expressed as $\langle\tilde{\boldsymbol{S}}, \boldsymbol{u}\rangle$.
This example demonstrates that, in addition to the 2-form $\tilde{\boldsymbol{F}}$, there is a vector $\boldsymbol{u}$ that encodes the same information via the physical notion of a flux. The vector $\boldsymbol{u}$ is the dual of $\tilde{\boldsymbol{F}}$. Similarly, the bivector $\boldsymbol{a} \wedge \boldsymbol{b}$ has a dual 1-form $\tilde{\boldsymbol{S}}$. We can obtain the dual of an object using the Hodge star operation; the object output by this operation depends on the dimensionality of the space in which we're working.${}^{3}$ We often work in $n=3$-dimensional space, in which the Hodge star takes a $q$-form and outputs a $(3-q)$-vector. As a result, the Hodge star of a 2-form is a vector. Similarly, in three dimensions the Hodge star converts a $p$-vector into a $(3-p)$-form, so that${}^{4}$ the bivector is dual to a 1-form.
In the next section, we describe the procedure required to compute a dual and its components. Although fairly straightforward, this is a slightly tedious business when $n \neq 3$, and so this section can be skipped on a first reading. (Some useful rules are collected in the margin at the end of the next section.)
37.2 Hodge star operation
The duality transformation, also known as the Hodge star operation, is carried out using a special tensor called the volume form. This is another example of a $p$-form, this one built from the basis forms of the space in which we're working. Working in the four-dimensional orthonormal basis${}^{5}$ with basis 1-forms $\boldsymbol{\omega}^{\hat{0}}, \ldots, \boldsymbol{\omega}^{\hat{3}}$, we construct a candidate volume 4-form $\tilde{\boldsymbol{\omega}}$ via${}^{6}$

${}^{3}$ In general, the Hodge star takes a $q$-form and outputs an $(n-q)$-vector. It also takes a $p$-vector and outputs an $(n-p)$-form.

${}^{4}$ This property of three-dimensional Euclidean space is what allows our conventional vector calculus to work as neatly as it does. In particular, it explains the existence of the cross product $\boldsymbol{a} \times \boldsymbol{b}$, which is the vector that is dual to the bivector $\boldsymbol{a} \wedge \boldsymbol{b}$.

${}^{5}$ Recall that the orthonormal basis is an example of a local frame in which measurements can be made. For (3+1)-dimensional spacetime it has a metric tensor with signature $(-+++)$.

${}^{6}$ It's a candidate at this stage as we don't yet have a means of determining the ordering of the basis 1-forms. This gives rise to a sign ambiguity that we resolve in the next Example.
In the orthonormal basis, the volume form has components $\omega_{\hat{\mu} \hat{\nu} \hat{\alpha} \hat{\beta}}=\varepsilon_{\mu \nu \alpha \beta}$, where $\varepsilon_{\mu \nu \alpha \beta}$ is the four-dimensional Levi-Civita symbol.
Example 37.2
We can treat the Levi-Civita symbol $\varepsilon_{\mu \nu \alpha \beta}$ as being the components of a Levi-Civita tensor $\boldsymbol{\varepsilon}(\,,\,,\,,\,)$. The tensor is antisymmetric in all of its slots, so it changes sign when any two vectors exchange their slots. In Minkowski spacetime, we fix the sign by saying${}^{7}$

${}^{7}$ We also have the rule, if we have $M$ indices, that
\begin{equation*}
\varepsilon_{\mu \nu \alpha \beta}=\left\{\begin{array}{rl}
0 & \text { unless } \mu, \nu, \alpha, \beta \text { are all different, } \tag{37.9}\\
+1 & \text { for even permutations of } 0,1,2,3, \\
-1 & \text { for odd permutations of } 0,1,2,3 .
\end{array}\right.
\end{equation*}
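The rule in eqn 37.9 is straightforward to implement by counting inversions of the index sequence; a minimal sketch in plain Python (the function name is ours):

```python
def eps(*idx):
    """Levi-Civita symbol: 0 if any index repeats, otherwise the parity
    (+1 even, -1 odd) of the permutation relative to ascending order."""
    if len(set(idx)) != len(idx):
        return 0
    # count inversions: pairs that appear out of ascending order
    inv = sum(1 for i in range(len(idx)) for j in range(i + 1, len(idx))
              if idx[i] > idx[j])
    return -1 if inv % 2 else 1

assert eps(0, 1, 2, 3) == 1     # identity permutation: even
assert eps(1, 0, 2, 3) == -1    # a single swap: odd
assert eps(0, 0, 2, 3) == 0     # repeated index
# in four dimensions there are 4! = 24 non-zero components
assert sum(abs(eps(m, n, a, b)) for m in range(4) for n in range(4)
           for a in range(4) for b in range(4)) == 24
```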
It's important to note that, if we're treating $\varepsilon_{\mu \nu \alpha \beta}$ as the components of a tensor, as we do here, then in Minkowski spacetime the all-up version has the property that
\begin{equation*}
\varepsilon^{0123}=-\varepsilon_{0123}=-1 .
\end{equation*}
If, instead of an orthonormal basis, we consider a general metric space, we can use an arbitrary coordinate system with basis 1-forms $\boldsymbol{d} x^{0}, \ldots, \boldsymbol{d} x^{n}$. In that case, we must include a factor of the determinant of the metric in our definition of the volume form thus:
where $g$ is the determinant of the metric and $s$ is the number of minus signs in the signature of the metric. The $(-1)^{s}$ factor ensures that we do not attempt to take the square root of a negative number.
Example 37.3
Consider a flat two-dimensional plane expressed in polar coordinates with line element${}^{8}$ $\mathrm{d} s^{2}=\mathrm{d} r^{2}+r^{2} \mathrm{~d} \theta^{2}$. In terms of the coordinates $\left(x^{1}, x^{2}\right)=(r, \theta)$, we have orthonormal basis 1-forms
If, instead, we choose to work in a coordinate basis with basis 1-forms $\boldsymbol{\omega}^{r}=\boldsymbol{d} r$ and $\boldsymbol{\omega}^{\theta}=\boldsymbol{d} \theta$, we must multiply $\boldsymbol{d} r \wedge \boldsymbol{d} \theta$ by a factor of $\sqrt{g}=\sqrt{\operatorname{det} g_{i j}}=r$ and obtain the same result for the volume form $\tilde{\boldsymbol{\omega}}$. Note, however, that in the coordinate basis, the volume form has components $\sqrt{g}\, \varepsilon_{i j}$, or
With the volume form $\tilde{\boldsymbol{\omega}}$ in hand, we can define the duality transformation using the Hodge star operation. In $n$-dimensional space, this operation takes a $(q, 0)$ tensor and outputs a $p$-form, where $p=n-q$. Consider a $(q, 0)$ antisymmetric tensor $\boldsymbol{T}$ with components $T^{\alpha \ldots \kappa}=T^{[\alpha \ldots \kappa]}$. The $p$-form $\tilde{\boldsymbol{A}}$ that is dual to $\boldsymbol{T}$ is written as
Despite the fussy definition, the practice of taking the dual turns out to be rather simple, especially in the usual $\left(x^{0}, \ldots, x^{n}\right)$ Minkowski space. Here the volume form is given by $\boldsymbol{d} x^{0} \wedge \ldots \wedge \boldsymbol{d} x^{n}$, and has components given by the Levi-Civita symbol $\varepsilon_{0 \ldots n}$.
Example 37.4
In three-dimensional Euclidean space in Cartesian coordinates, we have the volume 3-form $\tilde{\boldsymbol{\omega}}=\frac{1}{3!} \varepsilon_{\mu \nu \sigma} \boldsymbol{d} x^{\mu} \wedge \boldsymbol{d} x^{\nu} \wedge \boldsymbol{d} x^{\sigma}$ with components $\varepsilon_{\mu \nu \sigma}$, where $\varepsilon_{123}=1$.
Consider a bivector $\boldsymbol{T}=\boldsymbol{e}_{2} \wedge \boldsymbol{e}_{3}$, which has components $T^{\alpha \beta}=\boldsymbol{T}\left(\boldsymbol{\omega}^{\alpha}, \boldsymbol{\omega}^{\beta}\right)$ given by
This $(q=2)$-vector is dual to an object with valence $(0, p)=(0,3-2)=(0,1)$, that is, a 1-form $\tilde{\boldsymbol{A}}=\star \boldsymbol{T}$. This 1-form has components
${}^{9}$ This fussy sign convention is the one used in most general relativity texts and so we employ it here. It is useful since it allows us to raise the components of $\tilde{\boldsymbol{\omega}}$ with the metric as we expect to be able to do with the components of any tensor in a metric space.
with other components vanishing. We have, therefore, that $\tilde{\boldsymbol{A}}=\boldsymbol{\omega}^{1}=\boldsymbol{d} x^{1}$. In brief, $\star\left(\boldsymbol{e}_{y} \wedge \boldsymbol{e}_{z}\right)=\boldsymbol{d} x$ in three dimensions.
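The contraction $A_{\alpha}=\frac{1}{2!} \varepsilon_{\mu \nu \alpha} T^{\mu \nu}$ of Example 37.4 can be checked by brute force; a minimal sketch in plain Python (our own helper names; indices here run $0,1,2$ for $x, y, z$, so $\varepsilon_{012}=+1$ plays the role of the text's $\varepsilon_{123}=1$):

```python
def eps3(i, j, k):
    """Three-index Levi-Civita symbol with eps3(0, 1, 2) = +1."""
    seq = (i, j, k)
    if len(set(seq)) != 3:
        return 0
    inv = sum(1 for a in range(3) for b in range(a + 1, 3) if seq[a] > seq[b])
    return -1 if inv % 2 else 1

# bivector T = e_y ^ e_z: the only non-zero components are T^{yz} = +1, T^{zy} = -1
T = [[0] * 3 for _ in range(3)]
T[1][2], T[2][1] = 1, -1

# A_alpha = (1/2!) eps_{mu nu alpha} T^{mu nu}
A = [sum(eps3(m, n, a) * T[m][n] for m in range(3) for n in range(3)) / 2
     for a in range(3)]
assert A == [1.0, 0.0, 0.0]   # i.e. the dual 1-form is dx
```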
In order to define the inverse map that inputs the $p$-form and outputs a $q$-tensor, we need the components $\omega^{\alpha \ldots \kappa}$ of an $n$-vector version of the volume $n$-form $\tilde{\boldsymbol{\omega}}$. In a general metric space, these components are determined by the equation
where, once again, $s$ is the number of negative signs in the metric's signature.${}^{9}$ This guarantees the normalization condition that $\omega^{123 \ldots n}=(-1)^{s} / \omega_{123 \ldots n}$.
Example 37.5
For an orthonormal basis with $s=1$ we have that the components of the volume tensor are
and so $\omega^{\alpha \beta \ldots \kappa} \omega_{\alpha \beta \ldots \kappa}=\varepsilon^{\alpha \beta \ldots \kappa} \varepsilon_{\alpha \beta \ldots \kappa}=-n!$
In the usual $(3+1)$-dimensional metric spacetime with $s=1$, this gives us the relations
\begin{equation*}
\omega_{\mu \nu \alpha \beta}=\sqrt{-g} \varepsilon_{\mu \nu \alpha \beta}, \quad \omega^{\mu \nu \alpha \beta}=\frac{1}{\sqrt{-g}} \varepsilon^{\mu \nu \alpha \beta} \tag{37.25}
\end{equation*}
where $\varepsilon^{0123}=-\varepsilon_{0123}=-1$.
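The sign in this last relation comes from raising the four indices of $\varepsilon_{\mu \nu \alpha \beta}$ with the metric; a sketch in plain Python for flat spacetime with $\eta=\operatorname{diag}(-1,1,1,1)$ (helper names are ours):

```python
eta = [-1.0, 1.0, 1.0, 1.0]   # diagonal entries of the Minkowski metric

def eps4(*idx):
    """Four-index Levi-Civita symbol with eps4(0, 1, 2, 3) = +1."""
    if len(set(idx)) != 4:
        return 0
    inv = sum(1 for i in range(4) for j in range(i + 1, 4) if idx[i] > idx[j])
    return -1 if inv % 2 else 1

def eps_up(m, n, a, b):
    # for a diagonal metric, raising each index just multiplies by the
    # corresponding diagonal entry of the inverse metric (here eta itself)
    return eta[m] * eta[n] * eta[a] * eta[b] * eps4(m, n, a, b)

assert eps_up(0, 1, 2, 3) == -1.0   # eps^{0123} = -eps_{0123} = -1
```

Every non-zero component involves the index 0 exactly once, so every all-up component is minus its all-down counterpart.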
In $n$-dimensional space, the dual $(q, 0)$ tensor $\boldsymbol{S}$ of the $p$-form $\tilde{\boldsymbol{B}}$ is given by
Example 37.6
In flat $n=3$-dimensional space, we have the volume 3-form $\tilde{\boldsymbol{\omega}}=\boldsymbol{d} x^{1} \wedge \boldsymbol{d} x^{2} \wedge \boldsymbol{d} x^{3}$, with components $\varepsilon_{\mu \nu \sigma}$. The corresponding up version has components $\varepsilon^{\mu \nu \sigma}$, with $\varepsilon^{123}=1$. Consider the $(p=2)$-form $\tilde{\boldsymbol{A}}=\boldsymbol{d} x^{2} \wedge \boldsymbol{d} x^{3}$, which has components
This $(p=2)$-form is dual to an object with valence $(q, 0)=(3-2,0)=(1,0)$, that is, a vector. The components of the vector $\boldsymbol{S}=\star \tilde{\boldsymbol{A}}$ are
\begin{equation*}
S^{\alpha}=\frac{1}{p!} \omega^{\mu \nu \alpha} A_{\mu \nu}=\frac{1}{2!} \varepsilon^{\mu \nu \alpha}\left(\delta_{\mu}^{2} \delta_{\nu}^{3}-\delta_{\mu}^{3} \delta_{\nu}^{2}\right) \tag{37.29}
\end{equation*}
with other components vanishing. We have, therefore, $\boldsymbol{S}=\boldsymbol{e}_{1}$. In brief, $\star(\boldsymbol{d} y \wedge \boldsymbol{d} z)=\boldsymbol{e}_{x}$ in three dimensions.
Most of our work will be in Minkowski space, similar to the next example.
Example 37.7
In $n=4$-dimensional Minkowski space, we have the volume form $\tilde{\boldsymbol{\omega}}=\boldsymbol{d} t \wedge \boldsymbol{d} x \wedge \boldsymbol{d} y \wedge \boldsymbol{d} z$, with components $\varepsilon_{\mu \nu \alpha \beta}$. A $(p=2)$-form $\tilde{\boldsymbol{M}}=B\, \boldsymbol{d} y \wedge \boldsymbol{d} z$ has components
which is to say that $M_{23}=-M_{32}=B$ and all other components vanish. The dual of $\tilde{\boldsymbol{M}}$ is a $(2,0)$ tensor $\boldsymbol{M}=\star \tilde{\boldsymbol{M}}$ whose components are
We can also see that $\star(\boldsymbol{d} y \wedge \boldsymbol{d} z)=\boldsymbol{e}_{x} \wedge \boldsymbol{e}_{t}$.
Some useful examples of commonly encountered duals are collected in the margin.
Example 37.8
Most people are already familiar with a vector that results from taking a dual: the cross product. Consider the 2-form $\tilde{\boldsymbol{u}} \wedge \tilde{\boldsymbol{v}}$ in Euclidean $n=3$-dimensional space. It has components $w_{i j}=u_{i} v_{j}-u_{j} v_{i}$. Its dual is a $(q=1)$-vector $\boldsymbol{t}$, whose first component is
That is, a dual of a dual is, up to a sign, the tensor we started with.
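The cross-product-as-dual statement of Example 37.8 can be verified numerically; a sketch in plain Python (helper names are ours; the metric is Euclidean, so index position is immaterial):

```python
def eps3(i, j, k):
    """Three-index Levi-Civita symbol with eps3(0, 1, 2) = +1."""
    seq = (i, j, k)
    if len(set(seq)) != 3:
        return 0
    inv = sum(1 for a in range(3) for b in range(a + 1, 3) if seq[a] > seq[b])
    return -1 if inv % 2 else 1

def dual_of_wedge(u, v):
    # 2-form components w_ij = u_i v_j - u_j v_i, then t^k = (1/2!) eps^{ijk} w_ij
    w = [[u[i]*v[j] - u[j]*v[i] for j in range(3)] for i in range(3)]
    return [sum(eps3(i, j, k) * w[i][j] for i in range(3) for j in range(3)) / 2
            for k in range(3)]

u, v = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
cross = (u[1]*v[2]-u[2]*v[1], u[2]*v[0]-u[0]*v[2], u[0]*v[1]-u[1]*v[0])
assert dual_of_wedge(u, v) == list(cross)   # the dual IS u x v
```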
37.3 Volume forms
After that interlude into the abstract mathematics of the mappings between forms and vectors, we turn to the business of calculating areas and volumes in four-dimensional space. This can be done quite simply using the volume tensor discussed in the previous section.
In Chapter 32, we considered $n=2$-dimensional Euclidean space and the case shown in Fig. 37.1, where the area of the parallelogram is given by
\begin{equation*}
(\text { Area })=\left|\begin{array}{cc}
u^{x} & u^{y} \\
v^{x} & v^{y}
\end{array}\right|=u^{x} v^{y}-v^{x} u^{y}=\varepsilon_{\mu \nu} u^{\mu} v^{\nu} \tag{37.42}
\end{equation*}
There are two interesting observations to make about this. The first is that the area results from inserting the vectors $\boldsymbol{u}$ and $\boldsymbol{v}$ into a two-dimensional volume tensor $\tilde{\boldsymbol{\omega}}=\boldsymbol{d} x \wedge \boldsymbol{d} y$, which has components $\omega_{i j}=\varepsilon_{i j}$. That is
\begin{equation*}
(\text { Area })=\tilde{\boldsymbol{\omega}}(\boldsymbol{u}, \boldsymbol{v}) \tag{37.43}
\end{equation*}
The second is that, since the dual of a bivector is a number for $n=2$, we can link the area (a number) to the $(2,0)$ bivector $\boldsymbol{b}=\boldsymbol{u} \wedge \boldsymbol{v}$:
\begin{equation*}
(\text { Area })=\star \boldsymbol{b}=\star(\boldsymbol{u} \wedge \boldsymbol{v}) \tag{37.44}
\end{equation*}
We demonstrate this latter feature in the next example.
Example 37.9
We are working in (Euclidean) $n=2$ space and so a dual of a $(2,0)$ tensor gives us a 0-form (or number). The bivector has components $b^{\mu \nu}=u^{\mu} v^{\nu}-v^{\mu} u^{\nu}$, and so the dual $A$ has components
which is the expression for the area we had above.
We conclude that the volume of the parallelogram in $n=2$ space (also known as the area) is the dual of the bivector formed from its sides.
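The equal expressions in eqn 37.42 are easy to check for a particular pair of vectors; a sketch in plain Python (names are ours):

```python
def eps2(i, j):
    """Two-index Levi-Civita symbol with eps2(0, 1) = +1."""
    return 0 if i == j else (1 if (i, j) == (0, 1) else -1)

u, v = (3.0, 1.0), (1.0, 2.0)   # arbitrary parallelogram sides in the plane
det = u[0]*v[1] - v[0]*u[1]                 # determinant form of the area
contraction = sum(eps2(m, n) * u[m] * v[n]  # eps_{mu nu} u^mu v^nu
                  for m in range(2) for n in range(2))
assert det == contraction == 5.0
```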
In the same way, a 4-volume $\mathcal{V}$ in (3+1)-dimensional spacetime can be determined by filling in slots in a volume 4-form. As might be expected, $\tilde{\boldsymbol{\omega}}(\boldsymbol{A}, \boldsymbol{B}, \boldsymbol{C}, \boldsymbol{D})$ outputs the 4-volume $\mathcal{V}$ of a four-dimensional parallelepiped with sides formed from the 4-vectors $\boldsymbol{A}, \boldsymbol{B}, \boldsymbol{C}$ and $\boldsymbol{D}$. In terms of components in Minkowski spacetime, this is written as
For an infinitesimal box with sides described by vectors $\mathrm{d} x^{0} \boldsymbol{e}_{0}, \mathrm{~d} x^{1} \boldsymbol{e}_{1}, \mathrm{~d} x^{2} \boldsymbol{e}_{2}$ and $\mathrm{d} x^{3} \boldsymbol{e}_{3}$, we have an infinitesimal 4-volume
This means that an infinitesimal volume element (useful for integration) should be written as $\mathrm{d} \mathcal{V}=\sqrt{-g}\, \mathrm{d}^{4} x$, where $\mathrm{d}^{4} x$ is the usual Cartesian flat-space volume element $\mathrm{d} x^{0} \mathrm{~d} x^{1} \mathrm{~d} x^{2} \mathrm{~d} x^{3}$.
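As an illustration of the $\sqrt{-g}$ factor (the example metric here is our own, not from the text): Minkowski spacetime in spherical coordinates $(t, r, \theta, \phi)$ has $g=\operatorname{diag}\left(-1,1, r^{2}, r^{2} \sin ^{2} \theta\right)$, so $\sqrt{-g}=r^{2} \sin \theta$ and $\mathrm{d} \mathcal{V}$ reduces to the familiar spherical volume factor.

```python
import math

def sqrt_minus_g(r, theta):
    """sqrt(-det g) for Minkowski spacetime in spherical coordinates,
    where g = diag(-1, 1, r^2, r^2 sin^2(theta))."""
    diag = [-1.0, 1.0, r**2, (r * math.sin(theta))**2]
    det = 1.0
    for d in diag:
        det *= d
    return math.sqrt(-det)

# at theta = pi/2 the factor is just r^2
assert abs(sqrt_minus_g(2.0, math.pi / 2) - 4.0) < 1e-12
```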
We also see that we have access to the same volume $\mathcal{V}$ by taking the dual of the tetravector $(\boldsymbol{A} \wedge \boldsymbol{B} \wedge \boldsymbol{C} \wedge \boldsymbol{D})$, which is to say
\begin{equation*}
\mathcal{V}=\star(\boldsymbol{A} \wedge \boldsymbol{B} \wedge \boldsymbol{C} \wedge \boldsymbol{D}) \tag{37.49}
\end{equation*}
Another useful object is the 3-volume.${}^{10}$ Still working in (3+1)-dimensional spacetime, this quantity can be represented using a dual of a trivector by defining a 3-volume 1-form $\tilde{\boldsymbol{\sigma}}$. Noting that the dual of a $(3,0)$ tensor $\boldsymbol{A} \wedge \boldsymbol{B} \wedge \boldsymbol{C}$ is a 1-form in four-dimensional spacetime, we define $\tilde{\boldsymbol{\sigma}}$ to be given by
\begin{equation*}
\tilde{\boldsymbol{\sigma}}=-\star(\boldsymbol{A} \wedge \boldsymbol{B} \wedge \boldsymbol{C}) \tag{37.50}
\end{equation*}
Why define it like this? It's because this fits nicely with our use of the volume 4 -form. In fact, filling in three slots of the volume 4 -form gives us the 3 -volume 1 -form:
Consider Minkowski spacetime. A cuboid box has sides parallel to the spacelike basis vectors $\boldsymbol{e}_{i}$, with lengths $\delta x, \delta y$ and $\delta z$. The (spacelike) 3-volume of the box in its rest frame is given by
\begin{equation*}
\sigma_{0}=\tilde{\boldsymbol{\omega}}\left(\boldsymbol{e}_{0}, \delta x^{1} \boldsymbol{e}_{1}, \delta x^{2} \boldsymbol{e}_{2}, \delta x^{3} \boldsymbol{e}_{3}\right)=\varepsilon_{0123}\, \delta x^{1} \delta x^{2} \delta x^{3}=\delta x\, \delta y\, \delta z \tag{37.53}
\end{equation*}
which is the expected answer for the 3-volume of a box. All of the other components of $\tilde{\boldsymbol{\sigma}}$ vanish.

${}^{10}$ The 3-volume of an object in (3+1)-dimensional spacetime can be thought of as a spacelike hypersurface, that is, a slice of four-dimensional spacetime at a constant time. The normal to a spacelike hypersurface is timelike. We can also identify 3-volumes that are timelike hypersurfaces, which are slices of spacetime taken at a constant value of one spatial coordinate. These are defined by their normals being spacelike.

${}^{11}$ Recall that an observer with velocity $\boldsymbol{u}$ measures an energy $E=-\tilde{\boldsymbol{p}}(\boldsymbol{u})$ for a particle with momentum $\boldsymbol{p}$, and a number density of particles $n=-\tilde{\boldsymbol{J}}(\boldsymbol{u})$. We can add that they measure a 3-volume $V=\tilde{\boldsymbol{\sigma}}(\boldsymbol{u})$.

${}^{12}$ One way to think of this sign is that it reflects the fact that if a surface moves in some direction and passes through dust particles, then in the rest frame of the surface the flux of dust particles is directed towards the surface. Since we define the positive sense of flux through a surface as being directed outwards, this gives us a minus sign. Another way to think about this is to characterize the three-dimensional hypersurfaces using the direction of their normal, similar to the way we write $\mathrm{d} \vec{S}=\vec{n}\, \mathrm{d} S$ for a two-dimensional surface. From our definitions the signs are accounted for if we have that spacelike hypersurfaces have outward-directed (timelike) normals, while timelike hypersurfaces have inward-directed (spacelike) normals. This becomes important when choosing the orientation of a surface over which to integrate in later chapters.
In the local rest frame of an observer, their velocity vector $\boldsymbol{u}$ is taken to be equal to $\boldsymbol{e}_{\hat{0}}$. As a result, we can say that the 3-volume $V$ of a parallelepiped with sides $\boldsymbol{A}, \boldsymbol{B}$ and $\boldsymbol{C}$ measured by an observer with velocity $\boldsymbol{u}$ is${}^{11}$
\begin{equation*}
V=\tilde{\boldsymbol{\sigma}}(\boldsymbol{u})=\tilde{\boldsymbol{\omega}}(\boldsymbol{u}, \boldsymbol{A}, \boldsymbol{B}, \boldsymbol{C}) .
\end{equation*}
While the 0th component of $\tilde{\boldsymbol{\sigma}}$ gives the 3-volume of a spacelike surface, the other components tell us about the 3-volume of timelike surfaces.
Example 37.12
Again in Minkowski space, consider the spatial surface of our cuboid box with sides $\delta y\, \boldsymbol{e}_{2}$ and $\delta z\, \boldsymbol{e}_{3}$. When moving at a velocity $\boldsymbol{u}$, in a proper time $\delta \tau$ the surface sweeps out a volume whose third edge is $\boldsymbol{u}\, \delta \tau$. We then write a 3-volume 1-form
\begin{equation*}
\tilde{\boldsymbol{\sigma}}(\,)=\tilde{\boldsymbol{\omega}}\left(\,, \delta y\, \boldsymbol{e}_{2}, \delta z\, \boldsymbol{e}_{3}, \boldsymbol{u}\, \delta \tau\right) \tag{37.55}
\end{equation*}
In the rest frame of the box, we then have $\boldsymbol{u}=\boldsymbol{e}_{0}$ and so the components of $\tilde{\boldsymbol{\sigma}}(\,)$ in this frame are given by
\begin{align*}
\sigma_{0} & =\tilde{\boldsymbol{\omega}}\left(\boldsymbol{e}_{0}, \delta y\, \boldsymbol{e}_{2}, \delta z\, \boldsymbol{e}_{3}, \boldsymbol{e}_{0}\, \delta \tau\right)=0, \\
\sigma_{1} & =\tilde{\boldsymbol{\omega}}\left(\boldsymbol{e}_{1}, \delta y\, \boldsymbol{e}_{2}, \delta z\, \boldsymbol{e}_{3}, \boldsymbol{e}_{0}\, \delta \tau\right)=\varepsilon_{1230}\, \delta y\, \delta z\, \delta \tau, \\
\sigma_{2} & =\tilde{\boldsymbol{\omega}}\left(\boldsymbol{e}_{2}, \delta y\, \boldsymbol{e}_{2}, \delta z\, \boldsymbol{e}_{3}, \boldsymbol{e}_{0}\, \delta \tau\right)=0, \\
\sigma_{3} & =\tilde{\boldsymbol{\omega}}\left(\boldsymbol{e}_{3}, \delta y\, \boldsymbol{e}_{2}, \delta z\, \boldsymbol{e}_{3}, \boldsymbol{e}_{0}\, \delta \tau\right)=0 \tag{37.56}
\end{align*}
We conclude that $\sigma_{1}=-\delta y\, \delta z\, \delta \tau$, and the other components vanish. The minus sign might be slightly unexpected, since we have chosen the surface using the usual right-handed conventions, but it follows from the fact that the surface is at rest in the frame in which we've worked out the components.${}^{12}$
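The component computation of Example 37.12 can be replayed by brute-force contraction with $\varepsilon_{\mu \nu \alpha \beta}$; a sketch in plain Python (our own helper names; rest frame, so $\boldsymbol{u}=\boldsymbol{e}_{0}$):

```python
def eps4(*idx):
    """Four-index Levi-Civita symbol with eps4(0, 1, 2, 3) = +1."""
    if len(set(idx)) != 4:
        return 0
    inv = sum(1 for i in range(4) for j in range(i + 1, 4) if idx[i] > idx[j])
    return -1 if inv % 2 else 1

def basis(i, scale=1.0):
    v = [0.0] * 4
    v[i] = scale
    return v

dy, dz, dtau = 0.3, 0.5, 0.2
# the three filled slots: dy e_2, dz e_3, and u dtau = e_0 dtau in the rest frame
B, C, D = basis(2, dy), basis(3, dz), basis(0, dtau)

def sigma(mu):
    # sigma_mu = omega~(e_mu, B, C, D) = eps_{mu nu alpha beta} B^nu C^alpha D^beta
    A = basis(mu)
    return sum(eps4(m, n, a, b) * A[m] * B[n] * C[a] * D[b]
               for m in range(4) for n in range(4)
               for a in range(4) for b in range(4))

assert sigma(1) == -dy * dz * dtau          # eps_{1230} = -1 supplies the sign
assert sigma(0) == sigma(2) == sigma(3) == 0.0
```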
The volume form is often used to compute volume elements, as examined in the next example.
Example 37.13
Working in a coordinate frame we have
For an infinitesimal box with sides described in its rest frame by spacelike vectors $\mathrm{d} x^{1} \boldsymbol{e}_{1}, \mathrm{~d} x^{2} \boldsymbol{e}_{2}$ and $\mathrm{d} x^{3} \boldsymbol{e}_{3}$, we have a 3-volume 1-form integration element
leading us to conclude that the invariant proper 3-volume element is${}^{13}$ $\mathrm{d} V=\sqrt{-g}\, u^{0}\, \mathrm{d} x^{1} \mathrm{d} x^{2} \mathrm{d} x^{3}$.

${}^{13}$ Recall that for an observer at rest in a spacetime with a diagonal metric we have $u^{0}=\left(-g_{t t}\right)^{-\frac{1}{2}}$, showing how the timelike component of the metric is effectively divided out of the determinant.
The components $\mathrm{d} \sigma_{\mu}$ of the infinitesimal 3-volume 1-form will always be contracted against the components of a vector in expressions like $J^{\mu}\, \mathrm{d} \sigma_{\mu}=\mathrm{d} \tilde{\boldsymbol{\sigma}}(\boldsymbol{J})$.
Example 37.14
If the integration surface is the infinitesimal box in 3-space from above, we have a contribution
which computes the amount of the timelike (i.e. charge-like) component of the vector $\boldsymbol{J}$ in the box. A different choice of the three-dimensional integration surface might lead to a non-zero contribution like
\begin{equation*}
J^{1} \mathrm{~d} \sigma_{1}=\sqrt{-g} J^{1} \varepsilon_{1230} \mathrm{~d} x^{2} \mathrm{~d} x^{3} \mathrm{~d} x^{0}=-\sqrt{-g} J^{1} \mathrm{~d} y \mathrm{~d} z \mathrm{~d} t . \tag{37.61}
\end{equation*}
which computes the flux of the $J^{1}$ component of the current through the face of a surface with sides parallel to $\boldsymbol{e}_{2}$ and $\boldsymbol{e}_{3}$ in coordinate time $\mathrm{d} t$.
The material in this chapter will be used most extensively in Chapters 42 and 43. In the next chapter, we continue to examine the mathematics of forms and turn to the question of how they give rise to a new insight into what an integral is.
Chapter summary
The Hodge star operation allows us to map between forms and antisymmetric tensors built from vectors.
Duality allows us to compute volumes using the volume tensor $\tilde{\boldsymbol{\omega}}$. This is a 4-form in our usual $(3+1)$-dimensional spacetime.
The 3-volume of a box with sides described by vectors $\boldsymbol{A}, \boldsymbol{B}$ and $\boldsymbol{C}$, measured by an observer with velocity $\boldsymbol{u}$, is given by $\tilde{\boldsymbol{\sigma}}(\boldsymbol{u})=\tilde{\boldsymbol{\omega}}(\boldsymbol{u}, \boldsymbol{A}, \boldsymbol{B}, \boldsymbol{C})$.
Exercises
(37.1) Consider the volume 1-form $\tilde{\boldsymbol{\sigma}}$ describing the 3-volume of a box. Inserting this into the $(2,0)$ energy-momentum tensor $\boldsymbol{T}$, we have that the momentum $\boldsymbol{p}$ is given via
\begin{equation*}
\boldsymbol{T}(, \tilde{\boldsymbol{\sigma}})=\binom{\text { momentum crossing from }}{\text { negative to positive side of box }} \tag{37.62}
\end{equation*}
An observer with velocity $\boldsymbol{u}$ carrying the box measures its 3-volume as $V$.
(a) Show that the volume 1-form $\tilde{\boldsymbol{\sigma}}$ is related to the velocity 1-form by $\tilde{\boldsymbol{\sigma}}=-V \tilde{\boldsymbol{u}}$.
Use this result to find expressions for: (b) the momentum and (c) the total energy, in terms of $V$ and the components of $\boldsymbol{T}$ and $\tilde{\boldsymbol{u}}$.
(37.2) The observer with velocity vector $\boldsymbol{u}$ carries a box with sides $\boldsymbol{A}, \boldsymbol{B}$ and $\boldsymbol{C}$ which has permeable walls, allowing a particle current $\boldsymbol{J}$ to enter it.
(a) Explain why the observer can resolve the current into the form
(37.64)
(c) Now consider a second observer with velocity $\boldsymbol{v}$ carrying a box with sides $\boldsymbol{A}^{\prime}, \boldsymbol{B}^{\prime}$ and $\boldsymbol{C}^{\prime}$. What condition is necessary to ensure the boxes capture the same number of particles?
(37.3) (a) In a three-dimensional coordinate system with a metric, show that the first component of the curl of a vector $\boldsymbol{v}$ can be written as
Hint: This proof, given in the 1980 book by Schutz, relies on manipulating the indices of the LeviCivita symbol. If in doubt, prove the first identity for n=4n=4 and p=2p=2 or similar, and then generalize. Once the first identity is proven, the second identity follows using very similar arguments.
(37.5) The Lie derivative gives access to a general method for computing the divergence $\theta_{v}$ of a field $\boldsymbol{v}$. The divergence arises when we take the Lie derivative of the volume tensor $\tilde{\boldsymbol{\omega}}=\varepsilon_{|\alpha \beta \gamma \delta|}\, \boldsymbol{d} x^{\alpha} \wedge \boldsymbol{d} x^{\beta} \wedge \boldsymbol{d} x^{\gamma} \wedge \boldsymbol{d} x^{\delta}$. We define the divergence via
(37.6) Consider a uniform distribution of dust. We have a congruence of curves formed from the world lines of dust particles with a corresponding current (tangent) field $\boldsymbol{J}$. Consider again an observer with velocity $\boldsymbol{u}$ carrying a permeable box of volume $V$ spanned by vectors $\boldsymbol{A}, \boldsymbol{B}$ and $\boldsymbol{C}$.
(a) Explain why
(b) Show that for $\chi=\tilde{\boldsymbol{\omega}}(\boldsymbol{J}, \boldsymbol{A}, \boldsymbol{B}, \boldsymbol{C})$, where $\tilde{\boldsymbol{\omega}}$ is the volume 4-form, we have